19. Structured Markup Processing Tools
Python supports a variety of modules to work with various forms of structured data markup. This includes modules to work with the Standard Generalized Markup Language (SGML) and the Hypertext Markup Language (HTML), and several interfaces for working with the Extensible Markup Language (XML).
Modules in the xml package require at least one SAX-compliant XML parser
to be available. Starting with Python 2.3, the Expat parser is included
with Python, so the xml.parsers.expat module is always available. You may
still want to be aware of the PyXML add-on package, which provides an
extended set of XML libraries for Python.
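As a minimal sketch of what the bundled parser makes possible, the following uses xml.sax with its default parser to record element names. The handler class TagCollector, the helper collect_tags, and the sample document are hypothetical names chosen for illustration; they are not part of the library.

```python
import xml.sax
from xml.parsers import expat


class TagCollector(xml.sax.ContentHandler):
    """Hypothetical handler that records element names in document order."""

    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.tags = []

    def startElement(self, name, attrs):
        # Called by the SAX parser once for every opening tag.
        self.tags.append(name)


def collect_tags(document):
    """Parse a byte string with the default SAX parser and return its tag names."""
    handler = TagCollector()
    xml.sax.parseString(document, handler)
    return handler.tags


# Illustrative input only; any well-formed XML byte string would do.
sample = b"<catalog><book><title>Example</title></book></catalog>"
tags = collect_tags(sample)

# Version string of the Expat library the interpreter was built with.
expat_version = expat.EXPAT_VERSION
```

Here tags holds the element names in the order their opening tags were encountered. No third-party parser needs to be installed for this to run, which is the point of the paragraph above.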
The documentation for the xml.dom and xml.sax packages is the
definition of the Python bindings for the DOM and SAX interfaces.
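To give a feel for the DOM side of those bindings, here is a small sketch using xml.dom.minidom, one of the implementations listed in this chapter. The sample document and the variable names are assumptions made for illustration.

```python
from xml.dom import Node, minidom

# Illustrative input only; the element and attribute names are made up.
source = b"<catalog><book id='b1'><title>Example</title></book></catalog>"

document = minidom.parseString(source)
root = document.documentElement

# Navigate the tree with standard DOM methods.
book = root.getElementsByTagName("book")[0]
title = book.getElementsByTagName("title")[0]

book_id = book.getAttribute("id")      # attribute value as a string
title_text = title.firstChild.data     # character data of the Text node
is_element = root.nodeType == Node.ELEMENT_NODE

# Release the tree explicitly when done with it.
document.unlink()
```

Unlike the event-driven SAX interface, the DOM interface builds the whole tree in memory first, so it suits small documents that need random access more than large streams.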
- 19.1. HTMLParser: Simple HTML and XHTML parser
- 19.2. sgmllib: Simple SGML parser
- 19.3. htmllib: A parser for HTML documents
- 19.4. htmlentitydefs: Definitions of HTML general entities
- 19.5. XML Processing Modules
- 19.6. XML vulnerabilities
- 19.7. xml.etree.ElementTree: The ElementTree XML API
- 19.8. xml.dom: The Document Object Model API
  - 19.8.1. Module Contents
  - 19.8.2. Objects in the DOM
    - 19.8.2.1. DOMImplementation Objects
    - 19.8.2.2. Node Objects
    - 19.8.2.3. NodeList Objects
    - 19.8.2.4. DocumentType Objects
    - 19.8.2.5. Document Objects
    - 19.8.2.6. Element Objects
    - 19.8.2.7. Attr Objects
    - 19.8.2.8. NamedNodeMap Objects
    - 19.8.2.9. Comment Objects
    - 19.8.2.10. Text and CDATASection Objects
    - 19.8.2.11. ProcessingInstruction Objects
    - 19.8.2.12. Exceptions
  - 19.8.3. Conformance
- 19.9. xml.dom.minidom: Minimal DOM implementation
- 19.10. xml.dom.pulldom: Support for building partial DOM trees
- 19.11. xml.sax: Support for SAX2 parsers
- 19.12. xml.sax.handler: Base classes for SAX handlers
- 19.13. xml.sax.saxutils: SAX Utilities
- 19.14. xml.sax.xmlreader: Interface for XML parsers
- 19.15. xml.parsers.expat: Fast XML parsing using Expat
