lxml API
Package lxml :: Package html :: Module soupparser

Module soupparser

External interface to the BeautifulSoup HTML parser.
Classes
_PseudoTag
Functions
fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)
    Parse a string of HTML data into an Element tree using the BeautifulSoup parser.
parse(file, beautifulsoup=None, makeelement=None, **bsargs)
    Parse a file into an ElementTree using the BeautifulSoup parser.
convert_tree(beautiful_soup_tree, makeelement=None)
    Convert a BeautifulSoup tree to a list of Element trees.
_parse(source, beautifulsoup, makeelement, **bsargs)
_parse_doctype_declaration(...)
    match(string[, pos[, endpos]]) --> match object or None. Matches zero or more characters at the beginning of the string.
_convert_tree(beautiful_soup_tree, makeelement)
_init_node_converters(makeelement)
handle_entities(...)
    sub(repl, string[, count = 0]) --> newstring. Return the string obtained by replacing the leftmost non-overlapping occurrences of pattern in string by the replacement repl.
unichr(i) --> character
    Return a string of one character with ordinal i; 0 <= i < 256.
unescape(string)
Variables
_DECLARATION_OR_DOCTYPE = (<class 'bs4.element.Declaration'>, ...
__package__ = 'lxml.html'
Function Details

fromstring(data, beautifulsoup=None, makeelement=None, **bsargs)

Parse a string of HTML data into an Element tree using the BeautifulSoup parser.

Returns the root <html> Element of the tree.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
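
As a rough illustration of the call described above, the sketch below parses a small, deliberately unclosed HTML fragment with the default settings. It assumes that both lxml and BeautifulSoup (the bs4 package) are installed; the sample markup and the variable names are invented for the example.

```python
from lxml.html import soupparser

# Hypothetical input: tag soup with unclosed elements.
broken_html = "<p>Hello <b>world"

# No beautifulsoup or makeelement argument is given, so the defaults are used.
root = soupparser.fromstring(broken_html)

# The documented return value is the root <html> Element.
root_tag = root.tag

# Collect the text of the whole tree, in document order.
text = "".join(root.itertext())
print(root_tag, repr(text))
```

Because the result is an ordinary Element, the usual lxml tree methods (find, iter, xpath and so on) should be available on it.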

parse(file, beautifulsoup=None, makeelement=None, **bsargs)

Parse a file into an ElementTree using the BeautifulSoup parser.

You can pass a different BeautifulSoup parser through the beautifulsoup keyword, and a different Element factory function through the makeelement keyword. By default, the standard BeautifulSoup class and the default factory of lxml.html are used.
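
A minimal sketch of parsing from a file follows. It assumes lxml and bs4 are installed and that an open file object is an acceptable value for the file argument; the temporary file, its name and its contents are made up for the example.

```python
import os
import tempfile

from lxml.html import soupparser

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample document written to a throwaway file.
    path = os.path.join(tmp_dir, "sample.html")
    with open(path, "w", encoding="utf-8") as out:
        out.write("<html><body><h1>Title</h1><p>Body text</p></body></html>")

    # Parse the open file with the default BeautifulSoup class and factory.
    with open(path, encoding="utf-8") as source:
        tree = soupparser.parse(source)

# Unlike fromstring(), parse() yields an ElementTree, so fetch its root first.
root = tree.getroot()
headings = [el.text for el in root.iter("h1")]
print(root.tag, headings)
```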

convert_tree(beautiful_soup_tree, makeelement=None)

Convert a BeautifulSoup tree to a list of Element trees.

Returns a list instead of a single root Element to support HTML-like soup with more than one root element.

You can pass a different Element factory through the makeelement keyword.
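
To show why a list is returned, the sketch below converts an already parsed soup that has two top-level elements. It assumes the bs4 package is installed and that the soup is built with Python's built-in "html.parser"; the markup and variable names are illustrative only.

```python
from bs4 import BeautifulSoup
from lxml.html import soupparser

# Hypothetical soup with two sibling root elements and no <html> wrapper.
soup = BeautifulSoup("<p>first</p><p>second</p>", "html.parser")

# Each top-level node of the soup becomes its own Element tree.
elements = soupparser.convert_tree(soup)

summary = [(el.tag, el.text) for el in elements]
print(summary)
```

This route is useful when the soup has already been built or modified with BeautifulSoup and only the final conversion to lxml Elements is needed.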


Variables Details

_DECLARATION_OR_DOCTYPE

Value:
(<class 'bs4.element.Declaration'>, <class 'bs4.element.Doctype'>)

Generated by Epydoc 3.0.1 on Thu Jul 9 18:29:53 2020 http://epydoc.sourceforge.net