docutils.utils package

Miscellaneous utilities for the documentation utilities.

exception SystemMessage(system_message, level)[source]

Bases: ApplicationError

exception SystemMessagePropagation[source]

Bases: ApplicationError

class Reporter(source, report_level, halt_level, stream=None, debug=False, encoding=None, error_handler='backslashreplace')[source]

Bases: object

Info/warning/error reporter and system_message element generator.

Five levels of system messages are defined, along with corresponding methods: debug(), info(), warning(), error(), and severe().

There is typically one Reporter object per process. A Reporter object is instantiated with thresholds for reporting (generating warnings) and halting processing (raising exceptions), a switch to turn debug output on or off, and an I/O stream for warnings. These are stored as instance attributes.

When a system message is generated, its level is compared to the stored thresholds, and a warning or error is generated as appropriate. Debug messages are produced if the stored debug switch is on, independently of other thresholds. Message output is sent to the stored warning stream unless that stream is set to ‘’ (the empty string).

The Reporter class also employs a modified form of the “Observer” pattern [GoF95] to track system messages generated. The attach_observer method should be called before parsing, with a bound method or function which accepts system messages. The observer can be removed with detach_observer, and another added in its place.

[GoF95]

Gamma, Helm, Johnson, Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley, Reading, MA, USA, 1995.
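The threshold behaviour described above can be sketched without Docutils. This is a simplified model, not the Reporter implementation: the names MiniReporter, Halt and LEVELS are hypothetical, and the output format is an assumption made for illustration.

```python
import io

LEVELS = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'SEVERE']


class Halt(Exception):
    """Hypothetical stand-in for the SystemMessage exception."""


class MiniReporter:
    """Simplified model of the reporting and halting thresholds."""

    def __init__(self, report_level, halt_level, stream, debug=False):
        self.report_level = report_level
        self.halt_level = halt_level
        self.stream = stream
        self.debug_flag = debug
        self.max_level = -1

    def system_message(self, level, message):
        self.max_level = max(self.max_level, level)
        if level == 0:
            # Debug output depends only on the debug switch.
            report = self.debug_flag
        else:
            # Other levels are compared with the reporting threshold.
            report = level >= self.report_level
        if self.stream and report:
            self.stream.write(f'{LEVELS[level]}: {message}\n')
        if level >= self.halt_level:
            raise Halt(message)
        return level, message


out = io.StringIO()
reporter = MiniReporter(report_level=2, halt_level=4, stream=out)
reporter.system_message(1, 'below the reporting threshold')  # not written
reporter.system_message(2, 'at the reporting threshold')     # written
```

With these thresholds, a level-4 message would be written to the stream and then raise Halt.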

levels = ['DEBUG', 'INFO', 'WARNING', 'ERROR', 'SEVERE']

List of names for system message levels, indexed by level.

DEBUG_LEVEL = 0
INFO_LEVEL = 1
WARNING_LEVEL = 2
ERROR_LEVEL = 3
SEVERE_LEVEL = 4
__init__(source, report_level, halt_level, stream=None, debug=False, encoding=None, error_handler='backslashreplace')[source]
Parameters:
  • source: The path to or description of the source data.

  • report_level: The level at or above which warning output will be sent to stream.

  • halt_level: The level at or above which SystemMessage exceptions will be raised, halting execution.

  • debug: Show debug (level=0) system messages?

  • stream: Where warning output is sent. Can be a file-like object (has a .write method), a string (file name, opened for writing), ‘’ (empty string) or False (to discard all stream messages), or None (implies sys.stderr; the default).

  • encoding: The output encoding.

  • error_handler: The error handler for stderr output encoding.

source

The path to or description of the source data.

error_handler

The character encoding error handler.

debug_flag

Show debug (level=0) system messages?

report_level

The level at or above which warning output will be sent to self.stream.

halt_level

The level at or above which SystemMessage exceptions will be raised, halting execution.

stream

Where warning output is sent.

encoding

The output character encoding.

observers

List of bound methods or functions to call with each system_message created.

max_level

The highest level system message generated so far.

set_conditions(category, report_level, halt_level, stream=None, debug=False)[source]
attach_observer(observer)[source]

The observer parameter is a function or bound method which takes one argument, a nodes.system_message instance.
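A minimal sketch of the attach/notify/detach cycle follows. The class ObservableMessages is hypothetical and only mirrors the method names listed here; plain strings stand in for nodes.system_message instances.

```python
class ObservableMessages:
    """Hypothetical minimal holder of observers (not the Reporter class)."""

    def __init__(self):
        self.observers = []

    def attach_observer(self, observer):
        self.observers.append(observer)

    def detach_observer(self, observer):
        self.observers.remove(observer)

    def notify_observers(self, message):
        # Call every registered observer with the message.
        for observer in self.observers:
            observer(message)


collected = []
observer = collected.append          # a bound method can serve as observer
source = ObservableMessages()
source.attach_observer(observer)
source.notify_observers('first message')
source.detach_observer(observer)
source.notify_observers('second message')  # no observer left to receive it
```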

detach_observer(observer)[source]
notify_observers(message)[source]
system_message(level, message, *children, **kwargs)[source]

Return a system_message object.

Raise an exception or generate a warning if appropriate.

debug(*args, **kwargs)[source]

Level-0, “DEBUG”: an internal reporting issue. Typically, there is no effect on the processing. Level-0 system messages are handled separately from the others.

info(*args, **kwargs)[source]

Level-1, “INFO”: a minor issue that can be ignored. Typically there is no effect on processing, and level-1 system messages are not reported.

warning(*args, **kwargs)[source]

Level-2, “WARNING”: an issue that should be addressed. If ignored, there may be unpredictable problems with the output.

error(*args, **kwargs)[source]

Level-3, “ERROR”: an error that should be addressed. If ignored, the output will contain errors.

severe(*args, **kwargs)[source]

Level-4, “SEVERE”: a severe error that must be addressed. If ignored, the output will contain severe errors. Typically level-4 system messages are turned into exceptions which halt processing.

exception ExtensionOptionError[source]

Bases: DataError

exception BadOptionError[source]

Bases: ExtensionOptionError

exception BadOptionDataError[source]

Bases: ExtensionOptionError

exception DuplicateOptionError[source]

Bases: ExtensionOptionError

extract_extension_options(field_list, options_spec)[source]

Return a dictionary mapping extension option names to converted values.

Parameters:
  • field_list: A flat field list without field arguments, where each field body consists of a single paragraph only.

  • options_spec: Dictionary mapping known option names to a conversion function such as int or float.

Exceptions:
  • KeyError for unknown option names.

  • ValueError for invalid option values (raised by the conversion function).

  • TypeError for invalid option value types (raised by the conversion function).

  • DuplicateOptionError for duplicate options.

  • BadOptionError for invalid fields.

  • BadOptionDataError for invalid option data (missing name, missing data, bad quotes, etc.).

extract_options(field_list)[source]

Return a list of option (name, value) pairs from field names & bodies.

Parameters:
  • field_list: A flat field list, where each field name is a single word and each field body consists of a single paragraph only.

Exceptions:
  • BadOptionError for invalid fields.

  • BadOptionDataError for invalid option data (missing name, missing data, bad quotes, etc.).

assemble_option_dict(option_list, options_spec)[source]

Return a mapping of option names to values.

Parameters:
  • option_list: A list of (name, value) pairs (the output of extract_options()).

  • options_spec: Dictionary mapping known option names to a conversion function such as int or float.

Exceptions:
  • KeyError for unknown option names.

  • DuplicateOptionError for duplicate options.

  • ValueError for invalid option values (raised by the conversion function).

  • TypeError for invalid option value types (raised by the conversion function).
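The role of options_spec can be illustrated with a small stand-alone sketch. It is an approximation under stated assumptions, not the Docutils code: assemble_options_sketch and DuplicateOption are hypothetical names, and error messages are simplified.

```python
class DuplicateOption(Exception):
    """Hypothetical stand-in for DuplicateOptionError."""


def assemble_options_sketch(option_list, options_spec):
    """Convert (name, value) pairs using the functions in options_spec."""
    options = {}
    for name, value in option_list:
        convert = options_spec[name]    # KeyError for unknown option names
        if name in options:
            raise DuplicateOption(name)
        options[name] = convert(value)  # may raise ValueError or TypeError
    return options


spec = {'width': int, 'scale': float}
result = assemble_options_sketch([('width', '80'), ('scale', '0.5')], spec)
```

Here result maps 'width' to the integer 80 and 'scale' to the float 0.5, because each raw string is passed through the conversion function registered for its name.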

exception NameValueError[source]

Bases: DataError

decode_path(path)[source]

Ensure path is Unicode. Return str instance.

Decode file/path string in a failsafe manner if not already done.

extract_name_value(line)[source]

Return a list of (name, value) pairs from a line of the form “name=value …”.

Exceptions:
  • NameValueError for invalid input (missing name, missing data, bad quotes, etc.).

new_reporter(source_path, settings)[source]

Return a new Reporter object.

Parameters:
  • source_path (string): The path to or description of the source text of the document.

  • settings (optparse.Values object): Runtime settings.

new_document(source_path, settings=None)[source]

Return a new empty document object.

Parameters:
  • source_path (string): The path to or description of the source text of the document.

  • settings (optparse.Values object): Runtime settings. If none are provided, a default core set will be used. If you will use the document object with any Docutils components, you must provide their default settings as well.

For example, if parsing rST, at least provide the rst-parser settings, obtainable as follows:

Defaults for parser component:

import docutils.frontend
import docutils.parsers.rst
settings = docutils.frontend.get_default_settings(
               docutils.parsers.rst.Parser)

Defaults and configuration file customizations:

import docutils.core
import docutils.parsers.rst
settings = docutils.core.Publisher(
    parser=docutils.parsers.rst.Parser).get_settings()
clean_rcs_keywords(paragraph, keyword_substitutions)[source]
relative_path(source, target)[source]

Build and return a path to target, relative to source (both files).

Differences to os.path.relpath():

  • Inverse argument order.

  • source expects the path to a FILE, while os.path.relpath() expects a directory. (Add a “dummy” file name if source points to a directory.)

  • Always use the POSIX path separator (“/”) for the output.

  • Use os.sep for parsing the input (changes to os.sep are ignored by os.path.relpath()).

  • If there is no common prefix, return the absolute path to target.
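The first three differences can be approximated with the standard library alone. The function relative_path_sketch is a hypothetical illustration, not the Docutils implementation; it does not model the separator parsing or the "no common prefix" case.

```python
import os.path


def relative_path_sketch(source, target):
    """Return a "/"-separated path to target, relative to the file source."""
    # source names a file, so the starting directory is its parent.
    start = os.path.dirname(os.path.abspath(source))
    # os.path.relpath() takes the target first and the start directory second.
    relative = os.path.relpath(os.path.abspath(target), start)
    return relative.replace(os.sep, '/')


link = relative_path_sketch('docs/a/index.txt', 'docs/b/style.css')
```

The result is the reference one would write inside docs/a/index.txt to reach docs/b/style.css.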

get_stylesheet_reference(settings, relative_to=None)[source]

Retrieve a stylesheet reference from the settings object.

Deprecated. Use get_stylesheet_list() instead to enable specification of multiple stylesheets as a comma-separated list.

get_stylesheet_list(settings)[source]

Retrieve list of stylesheet references from the settings object.

find_file_in_dirs(path, dirs)[source]

Search for path in the list of directories dirs.

Return the first expansion that matches an existing file.
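The search order can be sketched as follows. The function find_in_dirs_sketch is hypothetical, and returning the unmodified path when nothing matches is an assumption, since the description above does not state the fallback.

```python
import os
import tempfile


def find_in_dirs_sketch(path, dirs):
    """Return the first existing join of a directory in dirs with path."""
    for directory in dirs:
        candidate = os.path.join(directory, path)
        if os.path.exists(candidate):
            return candidate
    return path  # assumption: fall back to the path as given


with tempfile.TemporaryDirectory() as first, \
        tempfile.TemporaryDirectory() as second:
    # Only the second directory contains the file.
    target = os.path.join(second, 'style.css')
    open(target, 'w').close()
    found = find_in_dirs_sketch('style.css', [first, second])
    missing = find_in_dirs_sketch('absent.css', [first, second])
```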

get_trim_footnote_ref_space(settings)[source]

Return whether or not to trim footnote space.

If trim_footnote_reference_space is not None, return it.

If trim_footnote_reference_space is None, return False unless the footnote reference style is ‘superscript’.
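The two rules amount to a short decision, sketched below. The function name is hypothetical, and reading the footnote reference style from an attribute named footnote_references is an assumption; a SimpleNamespace stands in for the settings object.

```python
from types import SimpleNamespace


def trim_footnote_ref_space_sketch(settings):
    """Return the explicit setting, or derive it from the reference style."""
    if settings.trim_footnote_reference_space is None:
        # Unset: trim only for superscript-style footnote references.
        return settings.footnote_references == 'superscript'
    return settings.trim_footnote_reference_space


unset_superscript = SimpleNamespace(trim_footnote_reference_space=None,
                                    footnote_references='superscript')
unset_brackets = SimpleNamespace(trim_footnote_reference_space=None,
                                 footnote_references='brackets')
explicit = SimpleNamespace(trim_footnote_reference_space=True,
                           footnote_references='brackets')
```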

get_source_line(node)[source]

Return the “source” and “line” attributes from the node given or from its closest ancestor.
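The ancestor lookup can be sketched with a hypothetical Node class that carries only source, line and parent attributes. Returning (None, None) when no ancestor has either attribute set is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical node with only the attributes needed here."""
    source: Optional[str] = None
    line: Optional[int] = None
    parent: Optional['Node'] = None


def source_line_sketch(node):
    """Walk up the parent chain until source or line information is found."""
    while node is not None:
        if node.source or node.line:
            return node.source, node.line
        node = node.parent
    return None, None  # assumption: nothing found anywhere in the chain


root = Node(source='doc.rst', line=12)
child = Node(parent=root)
grandchild = Node(parent=child)
location = source_line_sketch(grandchild)
```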

escape2null(text)[source]

Return a string with escape-backslashes converted to nulls.

split_escaped_whitespace(text)[source]

Split text on escaped whitespace (null+space or null+newline). Return a list of strings.
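How the two helpers fit together can be shown with a stand-alone approximation. Both functions below are hypothetical sketches rather than the Docutils code; they assume that the character following an escaping backslash is kept unchanged.

```python
import re


def escape2null_sketch(text):
    """Replace each escaping backslash with a null character."""
    # The character after the backslash is consumed and kept, so an
    # escaped backslash yields null + backslash.
    return re.sub(r'\\(.?)', lambda m: '\x00' + m.group(1), text,
                  flags=re.DOTALL)


def split_escaped_whitespace_sketch(text):
    """Split on null+space and null+newline."""
    parts = []
    for chunk in text.split('\x00 '):
        parts.extend(chunk.split('\x00\n'))
    return parts


nulled = escape2null_sketch('one\\ two three')
pieces = split_escaped_whitespace_sketch(nulled)
```

Only the escaped space acts as a separator; the unescaped space inside 'two three' is left alone.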

strip_combining_chars(text)[source]
find_combining_chars(text)[source]

Return indices of all combining chars in Unicode string text.

>>> from docutils.utils import find_combining_chars
>>> find_combining_chars('A t̆ab̆lĕ')
[3, 6, 9]
column_indices(text)[source]

Indices of Unicode string text when skipping combining characters.

>>> from docutils.utils import column_indices
>>> column_indices('A t̆ab̆lĕ')
[0, 1, 2, 4, 5, 7, 8]
east_asian_widths = {'A': 1, 'F': 2, 'H': 1, 'N': 1, 'Na': 1, 'W': 2}

Mapping of result codes from unicodedata.east_asian_width() to character column widths.

column_width(text)[source]

Return the column width of text.

Correct len(text) for wide East Asian and combining Unicode chars.
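The correction can be sketched with the standard unicodedata module and the width mapping shown above. The names WIDTHS and column_width_sketch are hypothetical; this approximates the described behaviour and is not necessarily how Docutils computes it.

```python
import unicodedata

# Same mapping as east_asian_widths above.
WIDTHS = {'A': 1, 'F': 2, 'H': 1, 'N': 1, 'Na': 1, 'W': 2}


def column_width_sketch(text):
    """Count columns: wide characters take two, combining characters none."""
    width = sum(WIDTHS[unicodedata.east_asian_width(ch)] for ch in text)
    combining = sum(1 for ch in text if unicodedata.combining(ch))
    return width - combining


plain = column_width_sketch('abc')
wide = column_width_sketch('日本語')
accented = column_width_sketch('e\u0301')  # "e" plus combining acute accent
```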

uniq(L)[source]
normalize_language_tag(tag)[source]

Return a list of normalized combinations for a BCP 47 language tag.

Example:

>>> from docutils.utils import normalize_language_tag
>>> normalize_language_tag('de_AT-1901')
['de-at-1901', 'de-at', 'de-1901', 'de']
>>> normalize_language_tag('de-CH-x_altquot')
['de-ch-x-altquot', 'de-ch', 'de-x-altquot', 'de']
xml_declaration(encoding=None)[source]

Return an XML text declaration.

Include an encoding declaration, if encoding is not ‘unicode’, ‘’, or None.
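The rule for the encoding declaration can be sketched as below. The function name is hypothetical, and the exact text of the returned declaration (XML version and trailing whitespace) is an assumption; only the condition on encoding comes from the description above.

```python
def xml_declaration_sketch(encoding=None):
    """Build an XML declaration, adding an encoding only when meaningful."""
    if encoding in ('unicode', '', None):
        return '<?xml version="1.0"?>'
    return f'<?xml version="1.0" encoding="{encoding}"?>'


without_encoding = xml_declaration_sketch('unicode')
with_encoding = xml_declaration_sketch('utf-8')
```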

class DependencyList(output_file=None, dependencies=())[source]

Bases: object

List of dependencies, with file recording support.

Note that the output file is not automatically closed. You have to explicitly call the close() method.

__init__(output_file=None, dependencies=())[source]

Initialize the dependency list, automatically setting the output file to output_file (see set_output()) and adding all supplied dependencies.

If output_file is None, no file output is done when calling add().

set_output(output_file)[source]

Set the output file and clear the list of already added dependencies.

output_file must be a string. The specified file is immediately overwritten.

If output_file is ‘-’, the output will be written to stdout.

add(*paths)[source]

Append each path in paths to self.list unless it is already there.

Also append to self.file unless it is already there or self.file is None.

close()[source]

Close the output file.
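The recording behaviour can be modelled with a small class that writes to any text stream. DependencyRecorder is a hypothetical stand-in, not DependencyList itself: it takes an open stream instead of a file name, and writing one path per line is an assumption.

```python
import io


class DependencyRecorder:
    """Hypothetical model of a de-duplicating dependency list."""

    def __init__(self, file=None, dependencies=()):
        self.list = []
        self.file = file
        self.add(*dependencies)

    def add(self, *paths):
        for path in paths:
            if path not in self.list:
                self.list.append(path)
                # Record the new path only if an output stream is set.
                if self.file is not None:
                    self.file.write(path + '\n')

    def close(self):
        # The stream is not closed automatically; call this explicitly.
        if self.file is not None:
            self.file.close()
            self.file = None


record = io.StringIO()
deps = DependencyRecorder(record)
deps.add('a.css', 'b.png', 'a.css')  # the repeated path is ignored
recorded = record.getvalue()
deps.close()
```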

version_identifier(version_info=None)[source]

Return a version identifier string built from version_info, a docutils.VersionInfo namedtuple instance or compatible tuple. If version_info is not provided, by default return a version identifier string based on docutils.__version_info__ (i.e. the current Docutils version).

Subpackages

Submodules