I want to replace the content of the <h1> tag of an HTML page.

The new content of the heading can itself be HTML, not just a plain string.

For example, I want to insert: foo <b>bold</b> bar

Input:

start 
<h1 class="myclass">bar <i>italic</i></h1>
end

Desired output:

start 
<h1 class="myclass">foo <b>bold</b> bar</h1>
end

How can I solve this with Python?

asked Jul 9, 2021 at 10:13

2 Answers

Using htql:

page="""start 
<h1 class="myclass">bar <i>italic</i></h1>
end
"""
import htql
x = htql.query(page, "<h1>:tx &replace('foo <b>bold</b> bar') ")[0][0]

You get:

>>> x
'start \n<h1 class="myclass">foo <b>bold</b> bar</h1>\nend\n'
answered Jul 17, 2021 at 15:04

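Using html5lib with ElementTree:

from xml.etree import ElementTree

from html5lib import HTMLParser

parser = HTMLParser(namespaceHTMLElements=False)
etree = parser.parse('start <h1 class="myclass">bar <i>italic</i></h1> end')
for h1 in etree.findall('.//h1'):
    # Drop the old children; iterate over a copy because h1 is mutated.
    for sub in list(h1):
        h1.remove(sub)
    # Parse the replacement fragment; html5lib wraps it in <html><body>.
    html = parser.parse('foo <b>bold</b> bar')
    body = html.find('.//body')
    # Copy the parsed child elements and the leading text into the <h1>.
    for sub in list(body):
        h1.append(sub)
    h1.text = body.text
print(ElementTree.tostring(etree))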
answered Jul 9, 2021 at 10:13
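
For comparison, here is a minimal sketch that uses only the standard library's html.parser. It is not taken from either answer above: the class name H1ContentReplacer and the helper replace_h1_content are hypothetical names chosen for illustration. The idea is to echo the input markup as it is parsed, drop whatever sits inside each <h1>, and write the replacement HTML there instead, so the rest of the page and the <h1> attributes are left alone. It assumes reasonably well-formed input without nested <h1> tags, and it rewrites end tags in lowercase.

```python
from html.parser import HTMLParser


class H1ContentReplacer(HTMLParser):
    """Echo the input markup, swapping the inner HTML of every <h1>."""

    def __init__(self, new_inner_html):
        # convert_charrefs=False keeps entities such as &amp; as written.
        super().__init__(convert_charrefs=False)
        self.new_inner_html = new_inner_html
        self.out = []
        self.inside_h1 = False

    def _emit(self, text):
        # Anything seen while inside an <h1> is old content and is dropped.
        if not self.inside_h1:
            self.out.append(text)

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the tag as written, attributes included.
        self._emit(self.get_starttag_text())
        if tag == "h1" and not self.inside_h1:
            self.out.append(self.new_inner_html)
            self.inside_h1 = True

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are echoed once, unchanged.
        self._emit(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag == "h1":
            self.inside_h1 = False
        self._emit(f"</{tag}>")

    def handle_data(self, data):
        self._emit(data)

    def handle_entityref(self, name):
        self._emit(f"&{name};")

    def handle_charref(self, name):
        self._emit(f"&#{name};")

    def handle_comment(self, data):
        self._emit(f"<!--{data}-->")

    def handle_decl(self, decl):
        self._emit(f"<!{decl}>")


def replace_h1_content(page, new_inner_html):
    """Return page with the content of each <h1> replaced by new_inner_html."""
    parser = H1ContentReplacer(new_inner_html)
    parser.feed(page)
    parser.close()
    return "".join(parser.out)


page = 'start \n<h1 class="myclass">bar <i>italic</i></h1>\nend\n'
result = replace_h1_content(page, "foo <b>bold</b> bar")
print(result)
```

Because the replacement is inserted as a raw string, it is never parsed or validated; that keeps the sketch short, but it means malformed replacement HTML passes straight through.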
