Regex Backend

Regex backend for act and property extraction.

This module contains the ActRegex class, which has everything needed to extract an act and its properties using regex rules.

class dodfminer.extract.polished.backend.regex.ActRegex

Act Regex Class.

This class encapsulates all functions and attributes related to the regex extraction process.


This class is one of the parent classes of the base act class.


All the regex flags used in extraction.


The regex rules for property extraction.


The regex rule for act extraction.

_find_prop_value(rule, act)

Find a single property in a single act.

  • rule (str) – The regex rule to search for.

  • act (str) – The act text to apply the rule to.


Returns the found property, or NaN if nothing is found.
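
The lookup described above can be sketched as follows. This is a minimal illustration, assuming the rule is applied with `re.search` and that the value sits in the first capturing group; the function name `find_prop_value`, the `flags` parameter, and the sample act text are hypothetical and not taken from the library source.

```python
import math
import re


def find_prop_value(rule, act, flags=0):
    """Return the value found by `rule` in `act`, or NaN if nothing matches.

    Hypothetical stand-in for the method described above; the real
    implementation may differ.
    """
    match = re.search(rule, act, flags=flags)
    if match is None:
        return math.nan
    # Assumption: use the first capturing group when the rule defines one,
    # otherwise fall back to the whole match.
    return match.group(1) if match.groups() else match.group(0)


# Illustrative act text, not real data.
sample_act = "NOMEAR JOAO DA SILVA, matricula 123.456-7, para o cargo de Assessor."
matricula = find_prop_value(r"matricula\s+([\d.\-]+)", sample_act)
missing = find_prop_value(r"processo\s+(\d+)", sample_act)
```

Returning NaN rather than raising keeps a missing property from aborting the extraction of the rest of the act.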


Rules for extraction of the properties.

Must return a dictionary of regex rules, where each key is a property type and each value is the rule for it.


NotImplementedError – Child classes must override this method.
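
A child class might satisfy this contract roughly as shown here. The class names, the method name `prop_rules`, and the two regex rules are all hypothetical and chosen only to illustrate the dictionary shape described above.

```python
class RegexBackendSketch:
    """Hypothetical base class mirroring the contract described above."""

    def prop_rules(self):
        # The base class has no rules of its own, so it forces an override.
        raise NotImplementedError("Child class needs to overwrite this method")


class AppointmentAct(RegexBackendSketch):
    """Hypothetical child class for a single act type."""

    def prop_rules(self):
        # Key: property type. Value: regex rule that captures its value.
        return {
            "name": r"NOMEAR\s+([A-Z ]+),",
            "role": r"cargo de\s+([^.,]+)",
        }


rules = AppointmentAct().prop_rules()
```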

classmethod _regex_flags()

Flags for the regex search.
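
Which flags the class actually returns is not stated here. As a general illustration, Python regex flags are combined with bitwise OR and passed to the search call; the choice of `re.IGNORECASE` and `re.DOTALL` below is an assumption for the example only.

```python
import re

# Assumed flag combination: case-insensitive, and "." also matches newlines.
FLAGS = re.IGNORECASE | re.DOTALL

# Illustrative text that spans two lines.
text = "NOMEAR MARIA\nSOUZA."
match = re.search(r"nomear\s+(.+?)\.", text, flags=FLAGS)
captured = match.group(1) if match else None
```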

_regex_props(act_raw) → dict

Create an act dict with all its properties.


act_raw (str) – The raw text of a single act.


Returns the act and its properties as a dictionary.
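
One way to picture this step is to apply every property rule to the raw act text and collect the results. The sketch below assumes each rule has one capturing group and that a missing property is stored as NaN; the function name `regex_props`, the rules, and the sample text are hypothetical.

```python
import math
import re


def regex_props(act_raw, rules, flags=0):
    """Build a dict with one entry per rule; missing properties become NaN."""
    act = {}
    for prop_type, rule in rules.items():
        match = re.search(rule, act_raw, flags=flags)
        # Assumption: every rule captures its value in group 1.
        act[prop_type] = match.group(1) if match else math.nan
    return act


# Illustrative rules and act text, not real data.
rules = {
    "name": r"NOMEAR\s+([A-Z ]+),",
    "role": r"cargo de\s+([^.,]+)",
    "process": r"processo\s+(\d+)",
}
act_raw = "NOMEAR MARIA SOUZA, para o cargo de Assessor Especial."
props = regex_props(act_raw, rules)
```

Every key from the rules dictionary appears in the result, so acts of the same type yield dictionaries with the same shape even when a property is absent.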


Rule for extraction of the act.


Must return a regex rule that finds an act in two parts, a head and a body, where only the body is used to search for properties.


NotImplementedError – Child classes must override this method.
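
A head-and-body rule of this kind can be written with two capturing groups. The pattern below is a hypothetical example, assuming each act opens with the keyword "NOMEAR" and runs until the next such keyword or the end of the text; it is not the rule used by any real act class.

```python
import re

# Hypothetical rule: group 1 is the head (the act's opening keyword),
# group 2 is the body (everything up to the next head or the end of text).
ACT_RULE = r"(NOMEAR)\s+(.*?)(?=NOMEAR|\Z)"

# Illustrative text containing two acts.
text = (
    "NOMEAR MARIA SOUZA, para o cargo de Assessor. "
    "NOMEAR JOAO LIMA, para o cargo de Diretor."
)

# With two groups, re.findall returns a list of (head, body) tuples.
found = re.findall(ACT_RULE, text, flags=re.DOTALL)

# Only the body is kept for the later property search.
bodies = [body.strip() for _head, body in found]
```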