lxml to parse html

Stefan Behnel stefan_ml at behnel.de
Mon Jan 23 03:56:11 EST 2012


contro opinion, 23.01.2012 08:34:
> import lxml.html
> myxml='''
> <cooperate>
> <job DecreaseHour="1" table="tpa_radio_sum">
> </job>
>
> <job DecreaseHour="2" table="tpa_radio_sum">
> </job>
>
> <job DecreaseHour="3" table="tpa_radio_sum">
> </job>
> </cooperate>
> '''
> root=lxml.html.fromstring(myxml)
> nodes1=root.xpath('//job[@DecreaseHour="1"]')
> nodes2=root.xpath('//job[@ne_type="101"]')
> print "nodes1=",nodes1
> print "nodes2=",nodes2
>
> what i get is:
> nodes1=[] and
> nodes2=[<Element job at 0x13636f0>]
> why nodes1 is []?nodes2=[<Element job at 0x13636f0>],

Not on my side. I get two empty lists.

> it is so strange thing?why ?

The really strange thing that I don't understand is why you would use an
HTML parser to parse an XML document. You should use lxml.etree instead.

Stefan

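A minimal sketch of the suggestion above, written for Python 3 rather than the Python 2 of the quoted code. The first query most likely comes back empty under lxml.html because the HTML parser folds attribute names to lowercase, so @DecreaseHour has nothing left to match. An XML parser keeps the case as written. The fallback import and the use of findall() instead of xpath() are assumptions of this sketch, chosen only so that it also runs where lxml is not installed.

```python
# Sketch: parse the posted snippet as XML so attribute names keep their case.
try:
    from lxml import etree  # the module recommended in the reply
except ImportError:
    # Assumption for this sketch only: the standard library's ElementTree
    # offers the same fromstring()/findall() calls used below.
    import xml.etree.ElementTree as etree

myxml = '''
<cooperate>
<job DecreaseHour="1" table="tpa_radio_sum">
</job>
<job DecreaseHour="2" table="tpa_radio_sum">
</job>
<job DecreaseHour="3" table="tpa_radio_sum">
</job>
</cooperate>
'''

root = etree.fromstring(myxml)

# XML attribute names are case-sensitive, so "DecreaseHour" matches as written.
nodes1 = root.findall(".//job[@DecreaseHour='1']")
# No <job> element carries an "ne_type" attribute, so this list stays empty.
nodes2 = root.findall(".//job[@ne_type='101']")

print("nodes1 =", [job.get("DecreaseHour") for job in nodes1])
print("nodes2 =", nodes2)
```

With lxml installed, the original root.xpath('//job[@DecreaseHour="1"]') call should work unchanged on a root parsed this way.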

More information about the Python-list mailing list