
I want to parse an HTML table into a 2D array (rows and columns) in Python using only HTMLParser. I don't want to use BeautifulSoup or other non-standard libraries.

This is for a personal project, doing this for fun :P

Anyway, here's my code. It's giving me a really messed up error - it says

asked Mar 5, 2012 at 10:02

1 Answer


I haven't checked what exactly you want to do, but you assign a string to self.txt and then try to use it as a list.

In the constructor, you initialize self.txt with an empty list:

    def __init__(self):
        ...
        self.txt = []
        ...

and then in the handle_data method:

    def handle_data(self, text):
        if (len(self.txt) > 0 ) :
            self.txt.append(text + " ")  # <-- Here you consider self.txt is a list
        if (self.in_table == 1 and self.in_th == 0):
            self.txt = text.lstrip()  # <-- Here you **assign a string** to self.txt
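
To make the mismatch concrete, here is a minimal sketch that isolates it. The class `Collector` and the two method names are hypothetical and only for illustration; they are not taken from the question's code. The "buggy" method rebinds self.txt to a string the way the line above does, after which any later call to self.txt.append would raise AttributeError, because str has no append method. The "fixed" method keeps self.txt a list by appending instead.

```python
class Collector:
    """Hypothetical stand-in for the parser class, reduced to the self.txt handling."""

    def __init__(self):
        self.txt = []  # starts out as a list, as in the constructor above

    def handle_data_buggy(self, text):
        # Rebinds self.txt to a str; later self.txt.append(...) calls will fail.
        self.txt = text.lstrip()

    def handle_data_fixed(self, text):
        # Keeps self.txt a list by appending the stripped fragment.
        self.txt.append(text.lstrip())


fixed = Collector()
fixed.handle_data_fixed("  first")
fixed.handle_data_fixed(" second")
print(fixed.txt)

buggy = Collector()
buggy.handle_data_buggy("  first")
try:
    buggy.txt.append("second")
except AttributeError as exc:
    print("append failed:", exc)
```

Whether this is the exact error from the question can't be told from what was posted, but it is what the two lines marked above would lead to once any non-empty text has been assigned.
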
answered Mar 5, 2012 at 11:33

1 Comment

Could you check what I did, though? I'm trying to get this done today. Basically, I'm adding the dehtml'ed data to a new list and then joining the list elements to create one big blob of dehtml'ed text. That's why self.txt is a list.
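
For reference, the collect-then-join idea described in this comment could be sketched roughly as follows. This is not the asker's code: it assumes Python 3 (where the class lives in html.parser), and the class name `TableParser` and the attributes `rows` and `row` are made up for the example. Each cell gets its own fragment list in self.txt, which is joined into one string when the cell closes, so self.txt is only ever a list or None.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Hypothetical sketch: collects the cells of each <tr> into a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []   # finished rows, a list of lists of strings
        self.row = None  # cells of the row being read, None outside <tr>
        self.txt = None  # text fragments of the current cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.row = []
        elif tag in ("td", "th") and self.row is not None:
            self.txt = []

    def handle_data(self, data):
        if self.txt is not None:
            self.txt.append(data)  # always append, never reassign

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.txt is not None:
            # Join the fragments and collapse runs of whitespace.
            self.row.append(" ".join("".join(self.txt).split()))
            self.txt = None
        elif tag == "tr" and self.row is not None:
            self.rows.append(self.row)
            self.row = None
            self.txt = None  # drop any cell left open by malformed markup


parser = TableParser()
parser.feed(
    "<table><tr><th>a</th><th>b</th></tr>"
    "<tr><td>1</td><td> 2 <b>x</b></td></tr></table>"
)
print(parser.rows)  # [['a', 'b'], ['1', '2 x']]
```

The sketch ignores nested tables and colspan/rowspan, so it only fits simple, flat tables.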
