docutils.parsers.rst.states module

This is the docutils.parsers.rst.states module, the core of the reStructuredText parser. It defines the following:

Classes:
  • RSTStateMachine: reStructuredText parser’s entry point.

  • NestedStateMachine: recursive StateMachine.

  • RSTState: reStructuredText State superclass.

  • Inliner: For parsing inline markup.

  • Body: Generic classifier of the first line of a block.

  • SpecializedBody: Superclass for compound element members.

  • BulletList: Second and subsequent bullet_list list_items.

  • DefinitionList: Second+ definition_list_items.

  • EnumeratedList: Second+ enumerated_list list_items.

  • FieldList: Second+ fields.

  • OptionList: Second+ option_list_items.

  • RFC2822List: Second+ RFC2822-style fields.

  • ExtensionOptions: Parses directive option fields.

  • Explicit: Second+ explicit markup constructs.

  • SubstitutionDef: For embedded directives in substitution definitions.

  • Text: Classifier of second line of a text block.

  • SpecializedText: Superclass for continuation lines of Text-variants.

  • Definition: Second line of potential definition_list_item.

  • Line: Second line of overlined section title or transition marker.

  • Struct: An auxiliary collection class.

Exception classes:
  • MarkupError

  • ParserError

  • MarkupMismatch

Functions:
  • escape2null(): Return a string, escape-backslashes converted to nulls.

  • unescape(): Return a string, nulls removed or restored to backslashes.

Attributes:
  • state_classes: set of State classes used with RSTStateMachine.

Parser Overview

The reStructuredText parser is implemented as a recursive state machine, examining its input one line at a time. To understand how the parser works, please first become familiar with the docutils.statemachine module. In the description below, references are made to classes defined in this module; please see the individual classes for details.

Parsing proceeds as follows:

  1. The state machine examines each line of input, checking each of the transition patterns of the state Body, in order, looking for a match. The implicit transitions (blank lines and indentation) are checked before any others. The ‘text’ transition is a catch-all (matches anything).

  2. The method associated with the matched transition pattern is called.

    1. Some transition methods are self-contained, appending elements to the document tree (for example, Body.doctest parses a doctest block). The parser’s current line index is advanced to the end of the element, and parsing continues with step 1.

    2. Other transition methods trigger the creation of a nested state machine, whose job is to parse a compound construct (‘indent’ does a block quote, ‘bullet’ does a bullet list, ‘overline’ does a section [first checking for a valid section header], etc.).

      • In the case of lists and explicit markup, a one-off state machine is created and run to parse contents of the first item.

      • A new state machine is created and its initial state is set to the appropriate specialized state (BulletList in the case of the ‘bullet’ transition; see SpecializedBody for more detail). This state machine is run to parse the compound element (or series of explicit markup elements), and returns as soon as a non-member element is encountered. For example, the BulletList state machine ends as soon as it encounters an element which is not a list item of that bullet list. The optional omission of inter-element blank lines is enabled by this nested state machine.

      • The current line index is advanced to the end of the elements parsed, and parsing continues with step 1.

    3. The result of the ‘text’ transition depends on the next line of text. The current state is changed to Text, under which the second line is examined. If the second line is:

      • Indented: The element is a definition list item, and parsing proceeds similarly to step 2.2, using the DefinitionList state.

      • A line of uniform punctuation characters: The element is a section header; again, parsing proceeds as in step 2.2, and Body is still used.

      • Anything else: The element is a paragraph, which is examined for inline markup and appended to the parent element. Processing continues with step 1.
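
Step 1 can be illustrated with a small sketch that uses only the standard library. It copies a subset of the Body.patterns expressions documented for the Body class and tries them in the order described above (implicit transitions first, ‘text’ last). The names TRANSITIONS and classify_first_line are illustrative and not part of docutils, and the ‘option_marker’ transition is left out.

```python
import re

# Fragment shared by the three enumerator formats (see Body.pats['enum']).
ENUM = r'([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)'

# Subset of Body.patterns in checking order: implicit transitions
# ('blank', 'indent') first, then the order of Body.initial_transitions.
TRANSITIONS = [
    ('blank', re.compile(r' *$')),
    ('indent', re.compile(r' +')),
    ('bullet', re.compile('[-+*\u2022\u2023\u2043]( +|$)')),
    ('enumerator', re.compile(
        r'((?P<parens>\(%s\))|(?P<rparen>%s\))|(?P<period>%s\.))( +|$)'
        % (ENUM, ENUM, ENUM))),
    ('field_marker', re.compile(
        r':(?![: ])([^:\\]|\\.|:(?!([ `]|$)))*(?<! ):( +|$)')),
    ('doctest', re.compile(r'>>>( +|$)')),
    ('line_block', re.compile(r'\|( +|$)')),
    ('grid_table_top', re.compile(r'\+-[-+]+-\+ *$')),
    ('simple_table_top', re.compile(r'=+( +=+)+ *$')),
    ('explicit_markup', re.compile(r'\.\.( +|$)')),
    ('anonymous', re.compile(r'__( +|$)')),
    ('line', re.compile(r'([!-/:-@[-`{-~])\1* *$')),
    ('text', re.compile('')),          # catch-all: matches anything
]


def classify_first_line(line):
    """Return the name of the first transition whose pattern matches."""
    for name, pattern in TRANSITIONS:
        if pattern.match(line):
            return name


for sample in ['- item', '3. third', '.. note::', '=====', 'Plain text']:
    print(repr(sample), '->', classify_first_line(sample))
```

Because the patterns are tried in a fixed order, a line such as ‘=== ===’ is claimed by ‘simple_table_top’ before ‘line’ is ever consulted, while ‘=====’ falls through to ‘line’.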

exception MarkupError[source]

Bases: DataError

exception UnknownInterpretedRoleError[source]

Bases: DataError

exception InterpretedRoleNotImplementedError[source]

Bases: DataError

exception ParserError[source]

Bases: ApplicationError

exception MarkupMismatch[source]

Bases: Exception

class Struct(**keywordargs)[source]

Bases: object

Stores data attributes for dotted-attribute access.
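
A class of this kind can be very small. The following is a minimal sketch of the idea, assuming that keyword arguments simply become instance attributes; StructSketch is a hypothetical stand-in, not the class defined in this module.

```python
class StructSketch:
    """Hypothetical stand-in: keyword arguments become attributes."""

    def __init__(self, **keywordargs):
        self.__dict__.update(keywordargs)


# Group related settings and read them back with dotted access.
enum = StructSketch(formats=['parens', 'rparen', 'period'],
                    sequences=['arabic', 'loweralpha'])
print(enum.formats[2])
```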

class RSTStateMachine(state_classes, initial_state, debug=False)[source]

Bases: StateMachineWS

reStructuredText’s master StateMachine.

The entry point to reStructuredText parsing is the run() method.

run(input_lines, document, input_offset=0, match_titles=True, inliner=None)[source]

Parse input_lines and modify the document node in place.

Extend StateMachineWS.run(): set up parse-global data and run the StateMachine.

class NestedStateMachine(state_classes, initial_state, debug=False)[source]

Bases: StateMachineWS

StateMachine run from within other StateMachine runs, to parse nested document structures.

run(input_lines, input_offset, memo, node, match_titles=True)[source]

Parse input_lines and populate a docutils.nodes.document instance.

Extend StateMachineWS.run(): set up document-wide data.

class RSTState(state_machine, debug=False)[source]

Bases: StateWS

reStructuredText State superclass.

Contains methods used by all State subclasses.

nested_sm

alias of NestedStateMachine

nested_sm_cache = [<docutils.parsers.rst.states.NestedStateMachine object>, <docutils.parsers.rst.states.NestedStateMachine object>, <docutils.parsers.rst.states.NestedStateMachine object>, <docutils.parsers.rst.states.NestedStateMachine object>, <docutils.parsers.rst.states.NestedStateMachine object>, <docutils.parsers.rst.states.NestedStateMachine object>]
runtime_init()[source]

Initialize this State before running the state machine; called from self.state_machine.run().

goto_line(abs_line_offset)[source]

Jump to input line abs_line_offset, ignoring jumps past the end.

no_match(context, transitions)[source]

Override StateWS.no_match to generate a system message.

This code should never be run.

bof(context)[source]

Called at beginning of file.

nested_parse(block, input_offset, node, match_titles=False, state_machine_class=None, state_machine_kwargs=None)[source]

Create a new StateMachine rooted at node and run it over the input block.

nested_list_parse(block, input_offset, node, initial_state, blank_finish, blank_finish_state=None, extra_settings={}, match_titles=False, state_machine_class=None, state_machine_kwargs=None)[source]

Create a new StateMachine rooted at node and run it over the input block. Also keep track of optional intermediate blank lines and the required final one.

section(title, source, style, lineno, messages)[source]

Check for a valid subsection and create one if it checks out.

check_subsection(source, style, lineno)[source]

Check for a valid subsection header. Return True or False.

When a new section is reached that isn’t a subsection of the current section, back up the line count (use previous_line(-x)), then raise EOFError. The current StateMachine will finish, then the calling StateMachine can re-examine the title. This will work its way back up the calling chain until the correct section level is reached.

@@@ Alternative: Evaluate the title, store the title info & level, and back up the chain until that level is reached. Store in memo? Or return in results?

Exception:

EOFError when a sibling or supersection encountered.
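
The decision described above can be modelled without any parser state. The sketch below assumes that title styles are remembered in order of first appearance (outermost first) and that the current nesting depth is known; classify_title_style and its return values are hypothetical, and the ‘unwind’ result corresponds to the case in which EOFError is raised.

```python
def classify_title_style(style, title_styles, current_level):
    """Classify a section title by its adornment style.

    title_styles: styles seen so far, outermost first (mutated in place).
    current_level: depth of the section being parsed (0 = document root).
    Both parameters are assumptions of this sketch.
    """
    if style not in title_styles:
        if len(title_styles) == current_level:
            title_styles.append(style)   # first title at a new, deeper level
            return 'subsection'
        return 'inconsistent'            # new style, but not at the deepest level
    level = title_styles.index(style) + 1
    if level <= current_level:
        return 'unwind'                  # sibling or supersection
    if level == current_level + 1:
        return 'subsection'              # immediate subsection
    return 'inconsistent'                # would skip a level


styles = []
print(classify_title_style('=', styles, 0))   # first top-level title
print(classify_title_style('-', styles, 1))   # first second-level title
print(classify_title_style('=', styles, 2))   # back to the top level
```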

title_inconsistent(sourcetext, lineno)[source]
new_subsection(title, lineno, messages)[source]

Append new subsection to document tree. On return, check level.

paragraph(lines, lineno)[source]

Return a list (paragraph and messages) and a boolean indicating whether a literal block follows.

inline_text(text, lineno)[source]

Return 2 lists: nodes (text and inline elements), and system_messages.

unindent_warning(node_name)[source]
build_regexp(definition, compile=True)[source]

Build, compile and return a regular expression based on definition.

Parameter:

definition: a 4-tuple (group name, prefix, suffix, parts), where “parts” is a list of regular expressions and/or regular expression definitions to be joined into an or-group.
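
One way to read the definition format is sketched below. It assumes that the joined alternatives are wrapped in a named group placed between the prefix and the suffix, and that nested definitions are tuples of the same shape; build_regexp_sketch and the sample definition are illustrative only.

```python
import re


def build_regexp_sketch(definition, compile=True):
    """Join the parts of a (name, prefix, suffix, parts) tuple into an or-group."""
    name, prefix, suffix, parts = definition
    alternatives = []
    for part in parts:
        if isinstance(part, tuple):      # nested definition: build recursively
            alternatives.append(build_regexp_sketch(part, compile=False))
        else:                            # plain regular expression string
            alternatives.append(part)
    regexp = '%s(?P<%s>%s)%s' % (prefix, name, '|'.join(alternatives), suffix)
    return re.compile(regexp) if compile else regexp


# Hypothetical definition: a start-string not preceded by a word character
# and not followed by whitespace.
definition = ('start', r'(?<!\w)', r'(?!\s)',
              [r'\*\*', r'\*', ('backquote', '', '', ['``', '`'])])
pattern = build_regexp_sketch(definition)
match = pattern.search('some ``literal`` text')
print(match.group('start'), match.group('backquote'))
```

Listing longer alternatives first (‘**’ before ‘*’) matters here, since the regular expression engine takes the first alternative that matches.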

class Inliner[source]

Bases: object

Parse inline markup; call the parse() method.

implicit_dispatch

List of (pattern, bound method) tuples, used by self.implicit_inline.

init_customizations(settings)[source]
parse(text, lineno, memo, parent)[source]

Return 2 lists: nodes (text and inline elements), and system_messages.

Using self.patterns.initial, a pattern which matches start-strings (emphasis, strong, interpreted, phrase reference, literal, substitution reference, and inline target) and complete constructs (simple reference, footnote reference), search for a candidate. When one is found, check for validity (e.g., not a quoted ‘*’ character). If valid, search for the corresponding end string if applicable, and check it for validity. If not found or invalid, generate a warning and ignore the start-string. Implicit inline markup (e.g. standalone URIs) is found last.

Parameters:

  • text: source string

  • lineno: absolute line number (cf. statemachine.get_source_and_line())

non_whitespace_before = '(?<!\\s)'
non_whitespace_escape_before = '(?<![\\s\\x00])'
non_unescaped_whitespace_escape_before = '(?<!(?<!\\x00)[\\s\\x00])'
non_whitespace_after = '(?!\\s)'
simplename = '(?:(?!_)\\w)+(?:[-._+:](?:(?!_)\\w)+)*'
uric = "[-_.!~*'()[\\];/:@&=+$,%a-zA-Z0-9\\x00]"
uri_end_delim = '[>]'
urilast = '[_~*/=+a-zA-Z0-9]'
uri_end = "(?:[_~*/=+a-zA-Z0-9]|[-_.!~*'()[\\];/:@&=+$,%a-zA-Z0-9\\x00](?=[>]))"
emailc = "[-_!~*'{|}/#?^`&=+$%a-zA-Z0-9\\x00]"
email_pattern = '\n          %(emailc)s+(?:\\.%(emailc)s+)*   # name\n          (?<!\\x00)@                      # at\n          %(emailc)s+(?:\\.%(emailc)s*)*   # host\n          %(uri_end)s                     # final URI char\n          '
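
The embedded comments and the %(name)s placeholders suggest that email_pattern is a template to be filled in and compiled in verbose mode. The sketch below makes that assumption explicit, using the emailc and uri_end values listed above; how the parser itself combines the pieces is not shown here.

```python
import re

emailc = r"[-_!~*'{|}/#?^`&=+$%a-zA-Z0-9\x00]"
uri_end = (r"(?:[_~*/=+a-zA-Z0-9]"
           r"|[-_.!~*'()[\];/:@&=+$,%a-zA-Z0-9\x00](?=[>]))")
email_pattern = r"""
          %(emailc)s+(?:\.%(emailc)s+)*   # name
          (?<!\x00)@                      # at
          %(emailc)s+(?:\.%(emailc)s*)*   # host
          %(uri_end)s                     # final URI char
          """

# Assumption: fill the template with %-formatting, then compile with
# re.VERBOSE so that the layout whitespace and comments are ignored.
email_re = re.compile(
    email_pattern % {'emailc': emailc, 'uri_end': uri_end}, re.VERBOSE)

match = email_re.search('Contact user@example.org.')
print(match.group(0))
```

The final-character rule keeps the sentence-ending period out of the match: a trailing ‘.’ is only accepted when it is followed by ‘>’.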
quoted_start(match)[source]

Test if inline markup start-string is ‘quoted’.

‘Quoted’ in this context means the start-string is enclosed in a pair of matching opening/closing delimiters (not necessarily quotes) or at the end of the match.

inline_obj(match, lineno, end_pattern, nodeclass, restore_backslashes=False)[source]
problematic(text, rawsource, message)[source]
emphasis(match, lineno)[source]
strong(match, lineno)[source]
interpreted_or_phrase_ref(match, lineno)[source]
phrase_ref(before, after, rawsource, escaped, text=None)[source]
adjust_uri(uri)[source]
interpreted(rawsource, text, role, lineno)[source]
literal(match, lineno)[source]
inline_internal_target(match, lineno)[source]
substitution_reference(match, lineno)[source]
footnote_reference(match, lineno)[source]

Handles nodes.footnote_reference and nodes.citation_reference elements.

reference(match, lineno, anonymous=False)[source]
anonymous_reference(match, lineno)[source]
standalone_uri(match, lineno)[source]
pep_reference(match, lineno)[source]
rfc_url = 'rfc%d.html'
rfc_reference(match, lineno)[source]
implicit_inline(text, lineno)[source]

Check each of the patterns in self.implicit_dispatch for a match, and dispatch to the stored method for the pattern. Recursively check the text before and after the match. Return a list of nodes.Text and inline element nodes.
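
The recursion described above can be sketched as follows. The two patterns and their handlers are hypothetical stand-ins for the entries of implicit_dispatch, and plain tuples are returned in place of node objects.

```python
import re


def uri_node(match):
    return ('reference', match.group(0))


def pep_node(match):
    return ('pep', int(match.group(1)))


# Hypothetical (pattern, handler) pairs; earlier entries take precedence.
IMPLICIT_DISPATCH = [
    (re.compile(r'https?://[^\s<>]*[^\s<>.,;!?]'), uri_node),
    (re.compile(r'PEP (\d+)'), pep_node),
]


def implicit_inline_sketch(text):
    """Split text into ('text', ...) tuples and handler results."""
    if not text:
        return []
    for pattern, handler in IMPLICIT_DISPATCH:
        match = pattern.search(text)
        if match:
            # Recurse on both sides so that every pattern gets a chance
            # on the remaining text.
            return (implicit_inline_sketch(text[:match.start()])
                    + [handler(match)]
                    + implicit_inline_sketch(text[match.end():]))
    return [('text', text)]


result = implicit_inline_sketch('See PEP 8 and https://example.org/x, ok')
print(result)
```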

dispatch = {'*': <function Inliner.emphasis>, '**': <function Inliner.strong>, ']_': <function Inliner.footnote_reference>, '_': <function Inliner.reference>, '__': <function Inliner.anonymous_reference>, '_`': <function Inliner.inline_internal_target>, '`': <function Inliner.interpreted_or_phrase_ref>, '``': <function Inliner.literal>, '|': <function Inliner.substitution_reference>}
_loweralpha_to_int(s, _zero=96)[source]
_upperalpha_to_int(s, _zero=64)[source]
_lowerroman_to_int(s)[source]
class Body(state_machine, debug=False)[source]

Bases: RSTState

Generic classifier of the first line of a block.

double_width_pad_char = '\x00'

Padding character for East Asian double-width text.

enum = <docutils.parsers.rst.states.Struct object>

Enumerated list parsing information.

grid_table_top_pat = re.compile('\\+-[-+]+-\\+ *$')

Matches the top (& bottom) of a full table.

simple_table_top_pat = re.compile('=+( +=+)+ *$')

Matches the top of a simple table.

simple_table_border_pat = re.compile('=+[ =]*$')

Matches the bottom & header bottom of a simple table.

pats = {'alpha': '[a-zA-Z]', 'alphanum': '[a-zA-Z0-9]', 'alphanumplus': '[a-zA-Z0-9_-]', 'enum': '([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)', 'longopt': '(--|/)[a-zA-Z0-9][a-zA-Z0-9_-]*([ =]([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?', 'nonalphanum7bit': '[!-/:-@[-`{-~]', 'optarg': '([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>)', 'option': '((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a-zA-Z0-9][a-zA-Z0-9_-]*([ =]([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?)', 'optname': '[a-zA-Z0-9][a-zA-Z0-9_-]*', 'parens': '(?P<parens>\\(([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))', 'period': '(?P<period>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\.)', 'rparen': '(?P<rparen>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))', 'shortopt': '(-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?'}

Fragments of patterns used by transitions.

patterns = {'anonymous': re.compile('__( +|$)'), 'blank': re.compile(' *$'), 'bullet': re.compile('[-+*•‣⁃]( +|$)'), 'doctest': re.compile('>>>( +|$)'), 'enumerator': re.compile('((?P<parens>\\(([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<rparen>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<period>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\.))( +|$)'), 'explicit_markup': re.compile('\\.\\.( +|$)'), 'field_marker': re.compile(':(?![: ])([^:\\\\]|\\\\.|:(?!([ `]|$)))*(?<! ):( +|$)'), 'grid_table_top': re.compile('\\+-[-+]+-\\+ *$'), 'indent': re.compile(' +'), 'line': re.compile('([!-/:-@[-`{-~])\\1* *$'), 'line_block': re.compile('\\|( +|$)'), 'option_marker': re.compile('((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a-zA-Z0-9][a-zA-Z0-9_-]*([ =]([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?)(, ((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a), 'simple_table_top': re.compile('=+( +=+)+ *$'), 'text': re.compile('')}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = ('bullet', 'enumerator', 'field_marker', 'option_marker', 'doctest', 'line_block', 'grid_table_top', 'simple_table_top', 'explicit_markup', 'anonymous', 'line', 'text')

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.

indent(match, context, next_state)[source]

Block quote.

block_quote(indented, line_offset)[source]
attribution_pattern = re.compile('(---?(?!-)|—) *(?=[^ \\n])')
split_attribution(indented, line_offset)[source]

Check for a block quote attribution and split it off:

  • First line after a blank line must begin with a dash (“--”, “---”, or an em-dash; matches self.attribution_pattern).

  • Every line after that must have consistent indentation.

  • Attributions must be preceded by block quote content.

Return a tuple of: (block quote content lines, attribution lines, attribution offset, remaining indented lines, remaining lines offset).
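
A reduced version of the first and third rules is sketched below. It only locates the line where an attribution could start, using the same expression as attribution_pattern, and ignores the indentation rule; find_attribution_start is a hypothetical helper.

```python
import re

# Same expression as Body.attribution_pattern ('\u2014' is the em-dash).
ATTRIBUTION = re.compile('(---?(?!-)|\u2014) *(?=[^ \\n])')


def find_attribution_start(lines):
    """Return the index of the first candidate attribution line, or None."""
    previous_blank = False
    content_seen = False
    for index, line in enumerate(lines):
        if not line.strip():
            previous_blank = True
            continue
        # A candidate must follow a blank line and some block quote content.
        if content_seen and previous_blank and ATTRIBUTION.match(line):
            return index
        content_seen = True
        previous_blank = False
    return None


quote = ['To be or not to be.', '', '-- Hamlet']
print(find_attribution_start(quote))
```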

check_attribution(indented, attribution_start)[source]

Check attribution shape. Return the index past the end of the attribution, and the indent.

parse_attribution(indented, line_offset)[source]
bullet(match, context, next_state)[source]

Bullet list item.

list_item(indent)[source]
enumerator(match, context, next_state)[source]

Enumerated list item.

parse_enumerator(match, expected_sequence=None)[source]

Analyze an enumerator and return the results.

Return:
  • the enumerator format (‘period’, ‘parens’, or ‘rparen’),

  • the sequence used (‘arabic’, ‘loweralpha’, ‘upperroman’, etc.),

  • the text of the enumerator, stripped of formatting, and

  • the ordinal value of the enumerator (‘a’ -> 1, ‘ii’ -> 2, etc.; None is returned for invalid enumerator text).

The enumerator format has already been determined by the regular expression match. If expected_sequence is given, that sequence is tried first. If not, we check for Roman numeral 1. This way, single-character Roman numerals (which are also alphabetical) can be matched. If no sequence has been matched, all sequences are checked in order.
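
The sequence selection can be sketched as below. The order in which sequences are checked, the lenient Roman numeral conversion, and the names used are assumptions of this example; the offsets 96 and 64 mirror the _zero defaults of _loweralpha_to_int and _upperalpha_to_int.

```python
import re

ROMAN_VALUES = {'i': 1, 'v': 5, 'x': 10, 'l': 50, 'c': 100, 'd': 500, 'm': 1000}


def roman_to_int(text):
    """Convert a Roman numeral (either case) without strict validation."""
    values = [ROMAN_VALUES[char] for char in text.lower()]
    total = 0
    for i, value in enumerate(values):
        # A smaller value before a larger one is subtracted ('iv' is 4).
        if i + 1 < len(values) and value < values[i + 1]:
            total -= value
        else:
            total += value
    return total


# (sequence name, pattern, converter); the checking order is an assumption.
SEQUENCES = [
    ('arabic', re.compile(r'[0-9]+$'), int),
    ('loweralpha', re.compile(r'[a-z]$'), lambda s: ord(s) - 96),
    ('upperalpha', re.compile(r'[A-Z]$'), lambda s: ord(s) - 64),
    ('lowerroman', re.compile(r'[ivxlcdm]+$'), roman_to_int),
    ('upperroman', re.compile(r'[IVXLCDM]+$'), roman_to_int),
]


def sequence_and_ordinal(text, expected_sequence=None):
    """Return (sequence name, ordinal) for enumerator text, or (None, None)."""
    candidates = list(SEQUENCES)
    if expected_sequence:
        # Stable sort: the expected sequence moves to the front.
        candidates.sort(key=lambda entry: entry[0] != expected_sequence)
    elif text in ('i', 'I'):
        # Roman numeral 1 takes precedence over the letter 'i'.
        candidates.sort(key=lambda entry: not entry[0].endswith('roman'))
    for name, pattern, convert in candidates:
        if pattern.match(text):
            return name, convert(text)
    return None, None


print(sequence_and_ordinal('v'))
print(sequence_and_ordinal('v', expected_sequence='lowerroman'))
```

The two calls show why expected_sequence matters: on its own, ‘v’ reads as the 22nd letter, but inside a lower-case Roman list it reads as 5.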

is_enumerated_list_item(ordinal, sequence, format)[source]

Check validity based on the ordinal value and the second line.

Return true if the ordinal is valid and the second line is blank, indented, or starts with the next enumerator or an auto-enumerator.

make_enumerator(ordinal, sequence, format)[source]

Construct and return the next enumerated list item marker, and an auto-enumerator (“#” instead of the regular enumerator).

Return None for invalid (out of range) ordinals.
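
A reduced sketch of this behaviour, covering only the arabic and alphabetic sequences, might look as follows. The FORMATS table, the trailing space on each marker, and the function name are assumptions made for illustration.

```python
# Prefix and suffix for each enumerator format.
FORMATS = {'parens': ('(', ')'), 'rparen': ('', ')'), 'period': ('', '.')}


def make_enumerator_sketch(ordinal, sequence, format):
    """Return (next marker, auto-enumerator marker), or None if out of range."""
    if sequence == 'arabic':
        text = str(ordinal)
    elif sequence in ('loweralpha', 'upperalpha'):
        if not 1 <= ordinal <= 26:
            return None                  # no single letter for this ordinal
        first = 'a' if sequence == 'loweralpha' else 'A'
        text = chr(ord(first) + ordinal - 1)
    else:
        raise ValueError('sequence not covered by this sketch: %r' % sequence)
    prefix, suffix = FORMATS[format]
    return prefix + text + suffix + ' ', prefix + '#' + suffix + ' '


print(make_enumerator_sketch(3, 'arabic', 'period'))
print(make_enumerator_sketch(2, 'loweralpha', 'parens'))
```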

field_marker(match, context, next_state)[source]

Field list item.

field(match)[source]
parse_field_marker(match)[source]

Extract & return field name from a field marker match.

parse_field_body(indented, offset, node)[source]
option_marker(match, context, next_state)[source]

Option list item.

option_list_item(match)[source]
parse_option_marker(match)[source]

Return a list of nodes.option and nodes.option_argument objects, parsed from an option marker match.

Exception:

MarkupError for invalid option markers.

doctest(match, context, next_state)[source]
line_block(match, context, next_state)[source]

First line of a line block.

line_block_line(match, lineno)[source]

Return one line element of a line_block.

nest_line_block_lines(block)[source]
nest_line_block_segment(block)[source]
grid_table_top(match, context, next_state)[source]

Top border of a full table.

simple_table_top(match, context, next_state)[source]

Top border of a simple table.

table_top(match, context, next_state, isolate_function, parser_class)[source]

Top border of a generic table.

table(isolate_function, parser_class)[source]

Parse a table.

isolate_grid_table()[source]
isolate_simple_table()[source]
malformed_table(block, detail='', offset=0)[source]
build_table(tabledata, tableline, stub_columns=0, widths=None)[source]
build_table_row(rowdata, tableline)[source]
explicit = <docutils.parsers.rst.states.Struct object>

Patterns and constants used for explicit markup recognition.

footnote(match)[source]
citation(match)[source]
make_target(block, block_text, lineno, target_name)[source]
parse_target(block, block_text, lineno)[source]

Determine the type of reference of a target.

Return:

A 2-tuple, one of:

  • ‘refname’ and the indirect reference name

  • ‘refuri’ and the URI

  • ‘malformed’ and a system_message node

is_reference(reference)[source]
add_target(targetname, refuri, target, lineno)[source]
substitution_def(match)[source]
disallowed_inside_substitution_definitions(node)[source]
directive(match, **option_presets)[source]

Returns a 2-tuple: list of nodes, and a “blank finish” boolean.

run_directive(directive, match, type_name, option_presets)[source]

Parse a directive then run its directive function.

Parameters:

  • directive: The class implementing the directive. Must be a subclass of rst.Directive.

  • match: A regular expression match object which matched the first line of the directive.

  • type_name: The directive name, as used in the source text.

  • option_presets: A dictionary of preset options, defaults for the directive options. Currently, only an “alt” option is passed by substitution definitions (value: the substitution name), which may be used by an embedded image directive.

Returns a 2-tuple: list of nodes, and a “blank finish” boolean.

parse_directive_block(indented, line_offset, directive, option_presets)[source]
parse_directive_options(option_presets, option_spec, arg_block)[source]
parse_directive_arguments(directive, arg_block)[source]
parse_extension_options(option_spec, datalines)[source]

Parse datalines for a field list containing extension options matching option_spec.

Parameters:
  • option_spec: a mapping of option name to conversion function, which should raise an exception on bad input.

  • datalines: a list of input strings.

Return:
  • Success value, 1 or 0.

  • An option dictionary on success, an error string on failure.
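
The contract between option_spec and the return value can be sketched as follows. This simplified version handles one-line fields only, and the FIELD expression, the error messages, and the function name are hypothetical.

```python
import re

FIELD = re.compile(r':([^:]+):\s*(.*)$')     # simplified one-line field


def parse_options_sketch(option_spec, datalines):
    """Return (1, options dict) on success or (0, error string) on failure."""
    options = {}
    for line in datalines:
        match = FIELD.match(line)
        if not match:
            return 0, 'invalid option block'
        name, value = match.group(1).lower(), match.group(2)
        if name not in option_spec:
            return 0, 'unknown option: "%s"' % name
        try:
            # The conversion function validates and converts the raw string.
            options[name] = option_spec[name](value)
        except (ValueError, TypeError) as error:
            return 0, 'invalid value for option "%s": %s' % (name, error)
    return 1, options


spec = {'width': int, 'alt': str}
print(parse_options_sketch(spec, [':width: 40', ':alt: A logo']))
print(parse_options_sketch(spec, [':width: wide'])[0])
```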

unknown_directive(type_name)[source]
comment(match)[source]
explicit_markup(match, context, next_state)[source]

Footnotes, hyperlink targets, directives, comments.

explicit_construct(match)[source]

Determine which explicit construct this is, parse & return it.

explicit_list(blank_finish)[source]

Create a nested state machine for a series of explicit markup constructs (including anonymous hyperlink targets).

anonymous(match, context, next_state)[source]

Anonymous hyperlink targets.

anonymous_target(match)[source]
line(match, context, next_state)[source]

Section title overline or transition marker.

text(match, context, next_state)[source]

Titles, definition lists, paragraphs.

format = 'period'
sequence = 'upperroman'
class RFC2822Body(state_machine, debug=False)[source]

Bases: Body

RFC2822 headers are only valid as the first constructs in documents. As soon as anything else appears, the Body state should take over.

patterns = {'anonymous': re.compile('__( +|$)'), 'blank': re.compile(' *$'), 'bullet': re.compile('[-+*•‣⁃]( +|$)'), 'doctest': re.compile('>>>( +|$)'), 'enumerator': re.compile('((?P<parens>\\(([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<rparen>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<period>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\.))( +|$)'), 'explicit_markup': re.compile('\\.\\.( +|$)'), 'field_marker': re.compile(':(?![: ])([^:\\\\]|\\\\.|:(?!([ `]|$)))*(?<! ):( +|$)'), 'grid_table_top': re.compile('\\+-[-+]+-\\+ *$'), 'indent': re.compile(' +'), 'line': re.compile('([!-/:-@[-`{-~])\\1* *$'), 'line_block': re.compile('\\|( +|$)'), 'option_marker': re.compile('((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a-zA-Z0-9][a-zA-Z0-9_-]*([ =]([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?)(, ((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a), 'rfc2822': re.compile('[!-9;-~]+:( +|$)'), 'simple_table_top': re.compile('=+( +=+)+ *$'), 'text': re.compile('')}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = [('bullet', 'Body'), ('enumerator', 'Body'), ('field_marker', 'Body'), ('option_marker', 'Body'), ('doctest', 'Body'), ('line_block', 'Body'), ('grid_table_top', 'Body'), ('simple_table_top', 'Body'), ('explicit_markup', 'Body'), ('anonymous', 'Body'), ('line', 'Body'), ('rfc2822', 'Body'), ('text', 'Body')]

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.

rfc2822(match, context, next_state)[source]

RFC2822-style field list item.

rfc2822_field(match)[source]
class SpecializedBody(state_machine, debug=False)[source]

Bases: Body

Superclass for second and subsequent compound element members. Compound elements are lists and list-like constructs.

All transition methods are disabled (redefined as invalid_input). Override individual methods in subclasses to re-enable.

For example, once an initial bullet list item, say, is recognized, the BulletList subclass takes over, with a “bullet_list” node as its container. Upon encountering the initial bullet list item, Body.bullet calls its self.nested_list_parse (RSTState.nested_list_parse), which starts up a nested parsing session with BulletList as the initial state. Only the bullet transition method is enabled in BulletList; as long as only bullet list items are encountered, they are parsed and inserted into the container. The first construct which is not a bullet list item triggers the invalid_input method, which ends the nested parse and closes the container. BulletList needs to recognize input that is invalid in the context of a bullet list, which means everything other than bullet list items, so it inherits the transition list created in Body.
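
The “disable everything, re-enable one transition” design can be shown in miniature. In the sketch below, the NotAMember exception, the classify helper, and the class names are all hypothetical; the real classes work with transition patterns and a nested state machine rather than a simple loop.

```python
class NotAMember(Exception):
    """Hypothetical signal that the nested parse should stop."""


class SpecializedSketch:
    def invalid_input(self, line):
        raise NotAMember(line)

    # Every transition starts out disabled.
    bullet = enumerator = text = invalid_input


class BulletListSketch(SpecializedSketch):
    def bullet(self, line):              # re-enable only the bullet transition
        return line[2:]


def classify(line):
    """Very rough first-line classifier, for this example only."""
    if line.startswith(('- ', '* ', '+ ')):
        return 'bullet'
    if line[:1].isdigit():
        return 'enumerator'
    return 'text'


def parse_members(state, lines):
    """Return (items, index of the first line that is not a member)."""
    items = []
    for index, line in enumerate(lines):
        try:
            items.append(getattr(state, classify(line))(line))
        except NotAMember:
            return items, index
    return items, len(lines)


lines = ['- one', '- two', 'A paragraph.', '- three']
print(parse_members(BulletListSketch(), lines))
```

The caller resumes at the returned index, so the paragraph and the later bullet item are classified again from scratch, the latter as the start of a new list.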

invalid_input(match=None, context=None, next_state=None)[source]

Not a compound element member. Abort this state machine.

indent(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

bullet(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

enumerator(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

field_marker(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

option_marker(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

doctest(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

line_block(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

grid_table_top(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

simple_table_top(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

explicit_markup(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

anonymous(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

line(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

text(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

class BulletList(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent bullet_list list_items.

bullet(match, context, next_state)[source]

Bullet list item.

class DefinitionList(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent definition_list_items.

text(match, context, next_state)[source]

Definition lists.

class EnumeratedList(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent enumerated_list list_items.

enumerator(match, context, next_state)[source]

Enumerated list item.

class FieldList(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent field_list fields.

field_marker(match, context, next_state)[source]

Field list field.

class OptionList(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent option_list option_list_items.

option_marker(match, context, next_state)[source]

Option list item.

class RFC2822List(state_machine, debug=False)[source]

Bases: SpecializedBody, RFC2822Body

Second and subsequent RFC2822-style field_list fields.

patterns = {'anonymous': re.compile('__( +|$)'), 'blank': re.compile(' *$'), 'bullet': re.compile('[-+*•‣⁃]( +|$)'), 'doctest': re.compile('>>>( +|$)'), 'enumerator': re.compile('((?P<parens>\\(([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<rparen>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\))|(?P<period>([0-9]+|[a-z]|[A-Z]|[ivxlcdm]+|[IVXLCDM]+|#)\\.))( +|$)'), 'explicit_markup': re.compile('\\.\\.( +|$)'), 'field_marker': re.compile(':(?![: ])([^:\\\\]|\\\\.|:(?!([ `]|$)))*(?<! ):( +|$)'), 'grid_table_top': re.compile('\\+-[-+]+-\\+ *$'), 'indent': re.compile(' +'), 'line': re.compile('([!-/:-@[-`{-~])\\1* *$'), 'line_block': re.compile('\\|( +|$)'), 'option_marker': re.compile('((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a-zA-Z0-9][a-zA-Z0-9_-]*([ =]([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?)(, ((-|\\+)[a-zA-Z0-9]( ?([a-zA-Z][a-zA-Z0-9_-]*|<[^<>]+>))?|(--|/)[a), 'rfc2822': re.compile('[!-9;-~]+:( +|$)'), 'simple_table_top': re.compile('=+( +=+)+ *$'), 'text': re.compile('')}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = [('bullet', 'Body'), ('enumerator', 'Body'), ('field_marker', 'Body'), ('option_marker', 'Body'), ('doctest', 'Body'), ('line_block', 'Body'), ('grid_table_top', 'Body'), ('simple_table_top', 'Body'), ('explicit_markup', 'Body'), ('anonymous', 'Body'), ('line', 'Body'), ('rfc2822', 'Body'), ('text', 'Body')]

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.

rfc2822(match, context, next_state)[source]

RFC2822-style field list item.

blank(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

class ExtensionOptions(state_machine, debug=False)[source]

Bases: FieldList

Parse field_list fields for extension options.

No nested parsing is done (including inline markup parsing).

parse_field_body(indented, offset, node)[source]

Override Body.parse_field_body for simpler parsing.

class LineBlock(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent lines of a line_block.

blank(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

line_block(match, context, next_state)[source]

New line of line block.

class Explicit(state_machine, debug=False)[source]

Bases: SpecializedBody

Second and subsequent explicit markup construct.

explicit_markup(match, context, next_state)[source]

Footnotes, hyperlink targets, directives, comments.

anonymous(match, context, next_state)[source]

Anonymous hyperlink targets.

blank(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

class SubstitutionDef(state_machine, debug=False)[source]

Bases: Body

Parser for the contents of a substitution_definition element.

patterns = {'blank': re.compile(' *$'), 'embedded_directive': re.compile('((?:(?!_)\\w)+(?:[-._+:](?:(?!_)\\w)+)*)::( +|$)'), 'indent': re.compile(' +'), 'text': re.compile('')}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = ['embedded_directive', 'text']

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.

embedded_directive(match, context, next_state)[source]
text(match, context, next_state)[source]

Titles, definition lists, paragraphs.

class Text(state_machine, debug=False)[source]

Bases: RSTState

Classifier of second line of a text block.

Could be a paragraph, a definition list item, or a title.

patterns = {'blank': re.compile(' *$'), 'indent': re.compile(' +'), 'text': re.compile(''), 'underline': re.compile('([!-/:-@[-`{-~])\\1* *$')}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = [('underline', 'Body'), ('text', 'Body')]

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.
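
The classification performed in this state can be sketched with the four patterns listed above, checking the implicit transitions first. The OUTCOME table and the function name are illustrative; further checks that a real parser would apply to a section title are ignored.

```python
import re

# Text.patterns in checking order: implicit transitions first.
TEXT_PATTERNS = [
    ('blank', re.compile(r' *$')),
    ('indent', re.compile(r' +')),
    ('underline', re.compile(r'([!-/:-@[-`{-~])\1* *$')),
    ('text', re.compile('')),
]

# What the first line turns out to be, given the matched transition.
OUTCOME = {
    'blank': 'paragraph',
    'indent': 'definition list item',
    'underline': 'section title',
    'text': 'paragraph',
}


def classify_text_block(second_line):
    """Classify a text block by looking at its second line."""
    for name, pattern in TEXT_PATTERNS:
        if pattern.match(second_line):
            return OUTCOME[name]


for line in ['=====', '    a definition', 'more text', '']:
    print(repr(line), '->', classify_text_block(line))
```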

blank(match, context, next_state)[source]

End of paragraph.

eof(context)[source]

Handle end-of-file. Return empty result.

Override in subclasses.

Parameter context: application-defined storage.

indent(match, context, next_state)[source]

Definition list item.

underline(match, context, next_state)[source]

Section title.

text(match, context, next_state)[source]

Paragraph.
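The way the second line of a text block selects one of the transitions above can be sketched with the patterns from Text.patterns. This is a simplified model, not the docutils implementation: the helper classify_second_line and the OUTCOME table are hypothetical, the check order (blank and indent first, then underline, then the catch-all text) is an assumption based on the parser overview, and the real methods do further validation that is not modelled here.

```python
import re

# Patterns copied from Text.patterns above.
PATTERNS = {
    'blank': re.compile(r' *$'),
    'indent': re.compile(r' +'),
    'underline': re.compile(r'([!-/:-@[-`{-~])\1* *$'),
    'text': re.compile(r''),
}

# Assumed check order: implicit transitions first, catch-all last.
ORDER = ('blank', 'indent', 'underline', 'text')

# Outcome summaries taken from the method descriptions above.
OUTCOME = {
    'blank': 'end of paragraph',
    'indent': 'definition list item',
    'underline': 'section title',
    'text': 'paragraph',
}


def classify_second_line(line):
    """Return (transition name, outcome) for the second line of a block."""
    for name in ORDER:
        if PATTERNS[name].match(line):
            return name, OUTCOME[name]
    raise AssertionError('unreachable: the text pattern matches anything')


if __name__ == '__main__':
    for line in ('', '    the definition', '=====', 'more text', '=-=-='):
        print(repr(line), '->', classify_second_line(line))
```

A line such as '=-=-=' falls through to 'text' because the underline pattern requires a single punctuation character repeated to the end of the line.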

literal_block()[source]

Return a list of nodes.

quoted_literal_block()[source]

definition_list_item(termline)[source]

classifier_delimiter = re.compile(' +: +')

term(lines, lineno)[source]

Return a definition_list’s term and optional classifiers.
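To show what classifier_delimiter separates, the sketch below splits a raw term line on that pattern. The function split_term is a hypothetical stand-in: the real term() method works on parsed inline nodes and returns node objects, so this only illustrates the delimiter itself on plain strings.

```python
import re

# Pattern copied from Text.classifier_delimiter above.
classifier_delimiter = re.compile(' +: +')


def split_term(line):
    """Split a term line into (term, [classifiers]) on ' : ' delimiters.

    Hypothetical helper for illustration; plain strings only.
    """
    parts = classifier_delimiter.split(line)
    return parts[0], parts[1:]


if __name__ == '__main__':
    print(split_term('term : classifier one : classifier two'))
    print(split_term('term'))
    print(split_term('key:value'))
```

A colon without surrounding spaces, as in 'key:value', is not treated as a delimiter, so the whole line remains the term.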

class SpecializedText(state_machine, debug=False)[source]

Bases: Text

Superclass for second and subsequent lines of Text-variants.

All transition methods are disabled. Override individual methods in subclasses to re-enable.

eof(context)[source]

Incomplete construct.

invalid_input(match=None, context=None, next_state=None)[source]

Not a compound element member. Abort this state machine.

blank(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

indent(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

underline(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

text(match=None, context=None, next_state=None)

Not a compound element member. Abort this state machine.

class Definition(state_machine, debug=False)[source]

Bases: SpecializedText

Second line of potential definition_list_item.

eof(context)[source]

Not a definition.

indent(match, context, next_state)[source]

Definition list item.

class Line(state_machine, debug=False)[source]

Bases: SpecializedText

Second line of over- & underlined section title or transition marker.

eofcheck = 1

Set to 0 while parsing sections, so that we don’t catch the EOF.

eof(context)[source]

Transition marker at end of section or document.

blank(match, context, next_state)[source]

Transition marker.

text(match, context, next_state)[source]

Potential over- & underlined title.

indent(match, context, next_state)

Potential over- & underlined title.

underline(match, context, next_state)[source]

Not a compound element member. Abort this state machine.

short_overline(context, blocktext, lineno, lines=1)[source]

state_correction(context, lines=1)[source]

class QuotedLiteralBlock(state_machine, debug=False)[source]

Bases: RSTState

Nested parse handler for quoted (unindented) literal blocks.

Special-purpose. Not for inclusion in state_classes.

patterns = {'initial_quoted': '([!-/:-@[-`{-~])', 'text': ''}

{Name: pattern} mapping, used by make_transition(). Each pattern may be a string or a compiled re pattern. Override in subclasses.

initial_transitions = ('initial_quoted', 'text')

A list of transitions to initialize when a State is instantiated. Each entry is either a transition name string, or a (transition name, next state name) pair. See make_transitions(). Override in subclasses.

blank(match, context, next_state)[source]

Handle blank lines. Does nothing. Override in subclasses.

eof(context)[source]

Handle end-of-file. Return empty result.

Override in subclasses.

Parameter context: application-defined storage.

indent(match, context, next_state)[source]

Handle an indented text block. Extend or override in subclasses.

Recursively run the registered state machine for indented blocks (self.indent_sm).

initial_quoted(match, context, next_state)[source]

Match arbitrary quote character on the first line only.

quoted(match, context, next_state)[source]

Match consistent quotes on subsequent lines.
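The two-step matching described by initial_quoted() and quoted() can be approximated as follows: the first line fixes the quote character, and later lines must begin with that same character. This is a minimal sketch under stated assumptions, not the docutils code. The function quoted_block_lines is hypothetical, and it simply stops at the first line that does not match, whereas the actual state may handle such a line differently (for example by reporting it).

```python
import re

# Pattern copied from QuotedLiteralBlock.patterns['initial_quoted'] above.
INITIAL_QUOTED = re.compile(r'([!-/:-@[-`{-~])')


def quoted_block_lines(lines):
    """Return the leading lines that start with the first line's quote char.

    Hypothetical helper for illustration; returns [] if the first line
    does not start with a punctuation character.
    """
    if not lines:
        return []
    first = INITIAL_QUOTED.match(lines[0])
    if first is None:
        return []
    # Build a pattern for the specific quote character found on line one.
    quoted = re.compile(re.escape(first.group(1)))
    block = []
    for line in lines:
        if not quoted.match(line):
            break
        block.append(line)
    return block


if __name__ == '__main__':
    print(quoted_block_lines(['> quoted', '> still quoted', 'not quoted']))
    print(quoted_block_lines(['> quoted', '| other quote']))
    print(quoted_block_lines(['plain text']))
```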

text(match, context, next_state)[source]

state_classes = (<class 'docutils.parsers.rst.states.Body'>, <class 'docutils.parsers.rst.states.BulletList'>, <class 'docutils.parsers.rst.states.DefinitionList'>, <class 'docutils.parsers.rst.states.EnumeratedList'>, <class 'docutils.parsers.rst.states.FieldList'>, <class 'docutils.parsers.rst.states.OptionList'>, <class 'docutils.parsers.rst.states.LineBlock'>, <class 'docutils.parsers.rst.states.ExtensionOptions'>, <class 'docutils.parsers.rst.states.Explicit'>, <class 'docutils.parsers.rst.states.Text'>, <class 'docutils.parsers.rst.states.Definition'>, <class 'docutils.parsers.rst.states.Line'>, <class 'docutils.parsers.rst.states.SubstitutionDef'>, <class 'docutils.parsers.rst.states.RFC2822Body'>, <class 'docutils.parsers.rst.states.RFC2822List'>)

Standard set of State classes used to start RSTStateMachine.