Acts

Acts are always built as child classes of the base class Atos. The base class structure and a guide for implementing your own act follow. A list of implemented and missing acts is also presented.

Base Class

class dodfminer.extract.polished.acts.base.Atos(file_name, backend='regex', pipeline=None)[source]

Base class for extracting an act and its properties to a dataframe.

Note

You should not use this class alone; use its child classes in the regex module.

Parameters
  • file_name (str) – The dodf file path.

  • backend (str) – The mechanism to use in extraction. Can be either regex or ner. Defaults to regex.

_file_name

The dodf file path.

Type

str

_text

The dodf content in string format.

Type

str

_acts_str

List of raw text acts.

Type

str

_name

Name of the act.

Type

str

_columns

List of the property names of the act.

Type

str

_raw_acts

List of raw text acts.

Type

list

_acts

List of acts with properties extracted.

Type

list

_data_frame

The resulting dataframe from the extraction process.

Type

dataframe

property acts_str

Vector of acts content as raw text.

Type

str

property data_frame

Act dataframe with properties extracted.

Type

dataframe

get_expected_colunms() list[source]

Get the expected columns for the dataframe.

Raises

NotImplementedError – Child class needs to overwrite this method.

property name

Name of the act.

Type

str

read_json(file_name)[source]

Reads a .json file of a DODF.

A single string with all the relevant text from the act section is extracted.

read_txt(file_name)[source]

Reads a .txt file of a DODF.

A single string with all the text of the file is extracted.

Implementing new acts

The acts base class is built so that new acts are easy to implement. A programmer who wants to help develop new acts need not worry about anything besides the regex or NER itself.

Mainly, the following functions need to be overridden in the child class:

Atos._act_name()[source]

Name of the act.

Must return a single string representing the act name.

Raises

NotImplementedError – Child class needs to overwrite this method.

Atos._props_names()[source]

Names of all the properties, used as the dataframe columns.

Must return a vector of strings representing the property names.

Warning

The first name will be used for the type-of-act property.

Raises

NotImplementedError – Child class needs to overwrite this method.
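
As a minimal sketch of these two overrides, the example below uses a stand-in base class so that it runs without DODFMiner installed. `BaseActSketch`, `LicencaSketch`, the act name `Licenca`, and the column names are all hypothetical and chosen only for illustration; the real `Atos` class does more than this.

```python
class BaseActSketch:
    """Hypothetical stand-in for the documented base class; not DODFMiner code."""

    def __init__(self):
        # Mirror the documented attributes: the act name and the column names.
        self._name = self._act_name()
        self._columns = self._props_names()

    def _act_name(self):
        raise NotImplementedError("Child class needs to overwrite this method.")

    def _props_names(self):
        raise NotImplementedError("Child class needs to overwrite this method.")


class LicencaSketch(BaseActSketch):
    """Illustrative act; its name and columns are assumptions."""

    def _act_name(self):
        # A single string naming the act.
        return "Licenca"

    def _props_names(self):
        # The first entry is reserved for the type-of-act property.
        return ["Tipo do Ato", "Nome", "Matricula", "Vigencia"]


act = LicencaSketch()
print(act._name, act._columns)
```

Instantiating `BaseActSketch` directly raises `NotImplementedError`, which matches the documented behaviour of the two methods when a child class does not provide them.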

Regex Methods

If you want to extract through regex, the following functions need to be written:

ActRegex._rule_for_inst()[source]

Rule for extraction of the act.

Warning

Must return a regex rule that finds an act in two parts, a head and a body. Only the body will be used to search for properties.

Raises

NotImplementedError – Child class needs to overwrite this method.

ActRegex._prop_rules()[source]

Rules for extraction of the properties.

Must return a dictionary of regex rules, where the key is the property type and the value is the rule.

Raises

NotImplementedError – Child class needs to overwrite this method.

Additionally, programmers who wish to change the regex flags for their class can override the following function in the child class:

classmethod ActRegex._regex_flags()[source]

Flags used in the regex search.
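
The sketch below shows one way the three regex hooks could fit together: a rule with a head group and a body group, a dictionary of property rules applied only to the body, and a flags hook. It is a self-contained stand-in, not the real `ActRegex`; the class names, the `find_acts` and `extract` helpers, the sample text, and the patterns are assumptions made for illustration.

```python
import re


class RegexActSketch:
    """Hypothetical stand-in that wires the three regex hooks together."""

    @classmethod
    def _regex_flags(cls):
        # Default: no flags. A child class may override this.
        return 0

    def _rule_for_inst(self):
        raise NotImplementedError("Child class needs to overwrite this method.")

    def _prop_rules(self):
        raise NotImplementedError("Child class needs to overwrite this method.")

    def find_acts(self, text):
        # Group 1 is the head and group 2 the body; only the body is kept.
        matches = re.finditer(self._rule_for_inst(), text, self._regex_flags())
        return [m.group(2) for m in matches]

    def extract(self, text):
        acts = []
        for body in self.find_acts(text):
            props = {}
            for name, rule in self._prop_rules().items():
                found = re.search(rule, body, self._regex_flags())
                # A property that is not found is stored as None.
                props[name] = found.group(1) if found else None
            acts.append(props)
        return acts


class LicencaRegexSketch(RegexActSketch):
    """Illustrative act; the patterns below are invented for the sample text."""

    @classmethod
    def _regex_flags(cls):
        return re.IGNORECASE

    def _rule_for_inst(self):
        # Head: the opening phrase. Body: everything up to the next semicolon.
        return r"(CONCEDER LICENCA)(.*?;)"

    def _prop_rules(self):
        return {
            "Nome": r"\ba\s+([^,]+),",
            "Matricula": r"matricula\s+([\d.\-]+)",
        }


sample = (
    "CONCEDER LICENCA a MARIA SILVA, matricula 123.456-7, a contar de 01/02/2020; "
    "CONCEDER LICENCA a JOAO SOUZA, matricula 765.432-1, a contar de 03/04/2020;"
)
extracted = LicencaRegexSketch().extract(sample)
print(extracted)
```

Keeping the head in its own group lets the act rule anchor on a distinctive opening phrase while the property rules only ever see the body.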

NER Methods

If NER will be used, you must add a trained model to the acts/models folder. The following method should also be overridden in your act:

ActNER._load_model()[source]

Load the model from the models folder.

Note

This function needs to be overridden in the child class. If it is not overridden, the backend will fall back to regex.
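
The fallback described in the note can be pictured with the sketch below. It is only one way such behaviour could be implemented, and it assumes that a missing model is signalled by `_load_model` returning `None`; the actual mechanism inside `ActNER` is not shown on this page. All class names here are hypothetical.

```python
class NerActSketch:
    """Hypothetical stand-in; the real ActNER internals may differ."""

    def __init__(self, backend="regex"):
        self._backend = backend
        self._model = None
        if backend == "ner":
            self._model = self._load_model()
            if self._model is None:
                # No model supplied by the child class: fall back to regex.
                self._backend = "regex"

    def _load_model(self):
        # Assumed default: no model available.
        return None


class WithModelSketch(NerActSketch):
    def _load_model(self):
        # A real act would load its trained model from the acts/models folder here.
        return object()  # placeholder standing in for a loaded model


without_model = NerActSketch(backend="ner")
with_model = WithModelSketch(backend="ner")
print(without_model._backend, with_model._backend)
```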

Change the Core File

After all functions have been implemented, the programmer needs to do a minor change in the core file. The following must be added:

from dodfminer.extract.polished.acts.act_file_name import NewActClass
_acts_ids["new_act_name"] = NewActClass
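
To illustrate why this registration matters, the sketch below shows how a name-to-class dictionary such as `_acts_ids` can be used to look up and instantiate an act by name. The placeholder `NewActClass` and the `build_act` helper are hypothetical and are not part of the core file.

```python
class NewActClass:
    """Placeholder for an act class implemented elsewhere (hypothetical)."""

    def __init__(self, file_name, backend="regex"):
        self.file_name = file_name
        self.backend = backend


# Registry mapping act names to their classes.
_acts_ids = {}
_acts_ids["new_act_name"] = NewActClass


def build_act(act_name, file_name, backend="regex"):
    """Hypothetical helper: look up the registered class and instantiate it."""
    try:
        act_class = _acts_ids[act_name]
    except KeyError:
        raise ValueError(f"Unknown act: {act_name!r}") from None
    return act_class(file_name, backend)


new_act = build_act("new_act_name", "some_dodf.txt")
print(type(new_act).__name__, new_act.backend)
```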

Base Class Mechanisms

These functions are not meant to be accessed directly, but they are listed here in case the programmer implementing an act needs more information.

Atos._extract_props()[source]

Extract properties of all the acts.

Returns

A vector of extracted acts dictionaries.

Atos._build_dataframe()[source]

Create a dataframe with the extracted properties.

Returns

The dataframe created.
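
The two steps above can be sketched with the standard library alone: one function turns raw act bodies into dictionaries of properties, and another aligns those dictionaries with the column names. The functions `extract_props` and `build_rows`, the sample data, and the choice to fill the first column with the act name are assumptions for illustration; the real methods operate on the instance attributes and return a dataframe.

```python
import re


def extract_props(raw_acts, prop_rules, flags=0):
    """Return one dictionary of extracted properties per raw act."""
    acts = []
    for body in raw_acts:
        props = {}
        for name, rule in prop_rules.items():
            found = re.search(rule, body, flags)
            # A property that is not found is stored as None.
            props[name] = found.group(1) if found else None
        acts.append(props)
    return acts


def build_rows(act_name, columns, acts):
    """Align each act with `columns`; the first column holds the type of act."""
    return [[act_name] + [props.get(col) for col in columns[1:]] for props in acts]


# Hypothetical sample data.
raw_acts = ["a MARIA SILVA, matricula 123.456-7;", "a JOAO SOUZA;"]
prop_rules = {
    "Nome": r"a\s+([^,;]+)",
    "Matricula": r"matricula\s+([\d.\-]+)",
}
columns = ["Tipo do Ato", "Nome", "Matricula"]

acts = extract_props(raw_acts, prop_rules)
rows = build_rows("Licenca", columns, acts)
print(rows)
```

With pandas installed, `pandas.DataFrame(rows, columns=columns)` would turn these rows into a dataframe.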

Implemented Acts and Properties

  • Abono
    • Nome

    • Matricula

    • Cargo_efetivo

    • Classe

    • Padrao

    • Quadro

    • Fundamento_legal

    • Orgao

    • Processo_sei

    • Vigencia

    • Matricula_siape

    • Cargo

    • Lotacao

  • Aposentadoria
    • Ato

    • Processo

    • Nome_ato

    • Cod_matricula_ato

    • Cargo

    • Classe

    • Padrao

    • Quadro

    • Fund_legal

    • Empresa_ato

  • Exoneração Efetivos
    • Nome

    • Matricula

    • Cargo_efetivo

    • Classe

    • Padrao

    • Carreira

    • Quadro

    • Processo_sei

    • Vigencia

    • A_pedido_ou_nao

    • Motivo

    • Fundamento_legal

    • Orgao

    • Simbolo

    • Hierarquia_lotacao

    • Cargo_comissionado

  • Exoneração Comissionados
    • Nome

    • Matricula

    • Simbolo

    • Cargo_comissionado

    • Hierarquia_lotacao

    • Orgao

    • Vigencia

    • Carreir

    • Fundamento_legal

    • A_pedido_ou_nao

    • Cargo_efetivo

    • Matricula_siape

    • Motivo

  • Nomeação Efetivos
    • Edital_normativo

    • Data_edital_normativo

    • Numero_dodf_edital_normativo

    • Data_dodf_edital_normativo

    • Edital_resultado_final

    • Data_edital_resultado_final

    • Numero_dodf_resultado_final

    • Data_dodf_resultado_final

    • Cargo

    • Especialidade

    • Carreira

    • Orgao

    • Candidato

    • Classe

    • Quadro

    • Candidato_pne

    • Padrao

  • Nomeação Comissionados
    • Edital_normativo

    • Data_edital_normativo

    • Numero_dodf_edital_normativo

    • Data_dodf_edital_normativo

    • Edital_resultado_final

    • Data_edital_resultado_final

    • Numero_dodf_resultado_final

    • Data_dodf_resultado_final

    • Cargo

    • Especialidade

    • Carreira

    • Orgao

    • Candidato

    • Classe

    • Quadro

    • Candidato_pne

    • Padrao

  • Retificações de Aposentadoria
    • Tipo do Ato

    • Tipo de Documento

    • Número do documento

    • Data do documento

    • Número do DODF

    • Data do DODF

    • Página do DODF

    • Nome do Servidor

    • Matrícula

    • Cargo

    • Classe

    • Padrao

    • Matricula SIAPE

    • Informação Errada

    • Informação Corrigida

  • Reversões
    • Processo_sei

    • Nome

    • Matricula

    • Cargo_efetivo

    • Classe

    • Padrao

    • Quadro

    • Fundamento_legal

    • Orgao

    • Vigencia

  • Substituições
    • Nome_substituto

    • Cargo_substituto

    • Matricula_substituto

    • Nome_substituido

    • Matricula_substituido

    • Simbolo_substitut

    • Cargo_objeto_substituicao

    • Simbolo_objeto_substituicao

    • Hierarquia_lotacao

    • Orgao

    • Data_inicial

    • Data_final

    • Matricula_siape

    • Motivo

  • Cessões
    • nome

    • matricula

    • cargo_efetivo

    • classe

    • padrao

    • orgao_cedente

    • orgao_cessionario

    • onus

    • fundamento legal

    • processo_SEI

    • vigencia

    • matricula_SIAPE

    • cargo_orgao_cessionario

    • simbolo

    • hierarquia_lotaca

  • Tornar sem efeito Aposentadoria
    • tipo_documento

    • numero_documento

    • data_documento

    • numero_dodf

    • data_dodf

    • pagina_dodf

    • nome

    • matricula

    • matricula_SIAPE

    • cargo_efetivo

    • classe

    • padrao

    • quadro

    • orgao

    • processo_SE