Processing Operators


MI Pipelines provides the following operators for task processing:

  • Notebook Operator

  • Python Operator

  • Shell Operator

  • Email Operator

Notebook Operator

The Notebook operator runs ipynb tasks that have been verified and saved in Notebook. Model code is usually stored in Git, so the Notebook operator typically takes the code directory from the Git Directory operator and executes the entry file in it. Typical uses are executing Python tasks, running Python code files, and training machine learning models. Model files that the code records with MLflow's log_model method are exposed as the operator's output.


The Notebook operator is often used in combination with the Git Directory operator. For example:

[Figure: notebook_calculator.png, a Notebook operator used together with a Git Directory operator]

Input Parameters Description

| Name | Required/optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | Directory | Specify the directory where the code is located, which usually comes from the directory specified by the Git Directory operator. |
| entrypoint | Required | String | Specify the name of the entry program file, including its path (files in different directories may share the same name). |
| requirements_file_path | Optional | String | Specify the path of the file that lists the dependency packages to install. |
| env | Optional | List | Specify the list of parameters to be passed. |
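
To make the relationship between workspace and entrypoint concrete, here is a minimal sketch of how an entry file could be located inside the workspace directory. It is not the operator's actual implementation: the helper name resolve_entrypoint, the example paths, and the KEY=VALUE form of the env entries are all assumptions made for illustration.

```python
from pathlib import Path


def resolve_entrypoint(workspace: str, entrypoint: str) -> Path:
    """Return the absolute path of the entry file inside the workspace.

    Hypothetical helper, not part of MI Pipelines.
    """
    root = Path(workspace).resolve()
    target = (root / entrypoint).resolve()
    # Reject paths that escape the workspace, for example "../other.ipynb".
    if root != target and root not in target.parents:
        raise ValueError(f"{entrypoint!r} is outside the workspace")
    if not target.is_file():
        raise FileNotFoundError(f"entry file not found: {target}")
    return target


# Hypothetical parameter values shaped like the table above.
# The KEY=VALUE form of the env entries is an assumption.
params = {
    "workspace": "/data/repo",
    "entrypoint": "models/train.ipynb",
    "requirements_file_path": "requirements.txt",
    "env": ["EPOCHS=10", "LEARNING_RATE=0.01"],
}
```

Because entrypoint includes the path, two notebooks with the same file name in different directories of the workspace remain distinguishable.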

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| mlflow_model_file_paths | List | List of model file paths recorded and output by MLflow's log_model method. |
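
For reference, the kind of notebook code that records a model with MLflow could look like the sketch below. It assumes that the mlflow and scikit-learn packages are installed (for example through requirements_file_path) and uses the scikit-learn flavor of log_model. The function name train_and_log, the toy data, and the artifact name "model" are illustrative choices, and how the operator collects the logged files into mlflow_model_file_paths is not shown here.

```python
def train_and_log():
    """Fit a small model and record it with MLflow's log_model."""
    # Imported inside the function so the sketch can be read and loaded
    # even where the packages are not installed.
    import mlflow
    import mlflow.sklearn
    from sklearn.linear_model import LinearRegression

    # Toy training data following y = 2x.
    X = [[1.0], [2.0], [3.0], [4.0]]
    y = [2.0, 4.0, 6.0, 8.0]

    with mlflow.start_run():
        model = LinearRegression().fit(X, y)
        # "model" names the logged artifact within the run; the name is arbitrary.
        mlflow.sklearn.log_model(model, "model")
    return model
```

Calling train_and_log() in a notebook cell trains the model and logs it inside a single MLflow run.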

Python Operator

The Python operator runs Python script tasks. Its parameters are divided into fixed parameters and dynamic parameters: fixed parameters cannot be deleted, while dynamic parameters can be added, modified, or deleted as needed.

Input Parameters Description

The table below lists the fixed parameters of the Python operator.

| Name | Required/optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | Directory | Specify the directory where the code is located, which usually comes from the Git Directory operator. |
| entrypoint | Required | String | Specify the name of the entry program file, including its path. |
| requirements_file_path | Optional | String | Specify the path of the file that lists the dependency packages to install. |
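
The rule that fixed parameters cannot be deleted while dynamic parameters can be added, modified, or deleted can be pictured with a toy container like the one below. This is only a model of the rule, not the MI Pipelines API: the class PythonOperatorParams, its methods, and the example parameter batch_size are hypothetical.

```python
FIXED_PARAMS = ("workspace", "entrypoint", "requirements_file_path")
REQUIRED_PARAMS = ("workspace", "entrypoint")


class PythonOperatorParams:
    """Toy container: fixed parameters always exist, dynamic ones come and go."""

    def __init__(self, workspace, entrypoint, requirements_file_path=None):
        if not workspace or not entrypoint:
            raise ValueError("workspace and entrypoint are required")
        self._values = {
            "workspace": workspace,
            "entrypoint": entrypoint,
            "requirements_file_path": requirements_file_path,
        }

    def set(self, name, value):
        """Add a dynamic parameter, or modify any existing parameter."""
        if name in REQUIRED_PARAMS and not value:
            raise ValueError(f"{name} is required and cannot be empty")
        self._values[name] = value

    def delete(self, name):
        """Delete a dynamic parameter; fixed parameters cannot be deleted."""
        if name in FIXED_PARAMS:
            raise ValueError(f"{name} is a fixed parameter and cannot be deleted")
        del self._values[name]

    def as_dict(self):
        """Return a copy of all current parameters."""
        return dict(self._values)
```

A short usage example under the same assumptions:

```python
params = PythonOperatorParams("/data/repo", "jobs/main.py")
params.set("batch_size", 32)   # add a dynamic parameter
params.set("batch_size", 64)   # modify it
params.delete("batch_size")    # delete it again
```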

Output Parameters Description

The Python operator has no fixed output parameters; output parameters can be added dynamically as needed.

Shell Operator

The Shell operator runs Shell script tasks. Its input and output parameters are the same as those of the Python operator; see the Python Operator section.

Email Operator

The Email operator sends email alerts.

Input Parameters Description

| Name | Required/optional | Type | Description |
| --- | --- | --- | --- |
| mail_host | Required | String | IP address or domain name of the email server, for example smtp.163.com or smtp.office365.com. |
| mail_user | Required | String | Email service user name. |
| mail_pass | Required | Password | Password corresponding to the user name. |
| sender | Required | String | Sender of the email. |
| receivers | Required | List | Recipient list, which can be derived from the user list in the organization. |
| content | Required | String | Content of the sent email. |
| subject | Required | String | Subject of the sent email. |

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| status | String | Email sending status. |
| content_out | String | Email content. |
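
To show how the input parameters above map onto an SMTP exchange, here is a minimal sketch built on Python's standard smtplib and email modules. It is not the operator's implementation. The function names, the default port 587 with STARTTLS, and the "success" and "failed" status strings are assumptions; your mail server may require different connection settings.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, receivers, subject, content):
    """Assemble a plain-text email from the operator-style parameters."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(receivers)
    msg["Subject"] = subject
    msg.set_content(content)
    return msg


def send_alert(mail_host, mail_user, mail_pass, sender, receivers,
               subject, content, port=587):
    """Send the alert and return values shaped like the output parameters."""
    msg = build_message(sender, receivers, subject, content)
    try:
        # Port 587 with STARTTLS is an assumption about the server.
        with smtplib.SMTP(mail_host, port, timeout=30) as server:
            server.starttls()
            server.login(mail_user, mail_pass)
            server.send_message(msg)
    except (smtplib.SMTPException, OSError) as exc:
        # The status strings here are illustrative, not the operator's values.
        return {"status": f"failed: {exc}", "content_out": content}
    return {"status": "success", "content_out": content}
```

Keeping message construction in build_message separate from delivery makes it possible to inspect the headers and body without contacting a mail server.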