Program Operators


MI Pipelines provides the following operators for task processing:

  • Notebook Operator

  • Python Operator

  • Shell Operator

  • Email Operator

  • NotebookEx Operator

  • PythonEx Operator

  • ShellEx Operator

  • Pipeline Trigger Operator

  • APIM Operator

  • ParallelFor Status List Operator

Notebook Operator

The Notebook operator is typically used to run ipynb-type tasks that have been verified and saved in Notebook. It is often combined with the Git Directory operator: model code files are usually stored in Git, and the Notebook operator takes the code files from the Git Directory operator and runs them. Typical scenarios include running Python code files and training machine learning models. The generated model files are recorded and exported through MLflow's log_model method.


The input and output parameters for the Notebook operator can be configured or sorted dynamically based on business needs.

Input Parameters Description

The following table shows the commonly used input parameters.

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | Directory | Specify the file directory where the code is located, which usually comes from the directory specified by the Git Directory operator. |
| entrypoint | Required | String | Specify the name of the entry program file, which should include the path (because there may be files with the same name in different directories). |
| requirements_file_path | Optional | String | Specify the file path of the dependent package to be installed. |
| env | Optional | List | Specify the list of parameters to be passed. |
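The following Python sketch illustrates why entrypoint should include a path: two files with the same name in different directories resolve to different targets. It is only an illustration and not the operator's actual implementation. The function name resolve_entrypoint and the assumption that entrypoint is interpreted relative to workspace are hypothetical.

```python
from pathlib import Path


def resolve_entrypoint(workspace: str, entrypoint: str) -> Path:
    """Resolve an entrypoint path inside a workspace directory.

    Assumption (not stated in the documentation): entrypoint is
    interpreted relative to workspace.
    """
    root = Path(workspace).resolve()
    target = (root / entrypoint).resolve()
    # Reject entrypoints that escape the workspace, e.g. "../other.ipynb".
    if root not in target.parents:
        raise ValueError(f"{entrypoint!r} is outside workspace {workspace!r}")
    return target


# Two notebooks share a file name but live in different directories,
# so only the full relative path identifies the one to run.
train_nb = resolve_entrypoint("/tmp/ws", "train/main.ipynb")
eval_nb = resolve_entrypoint("/tmp/ws", "eval/main.ipynb")
```

Here train_nb and eval_nb have the same file name but are different paths, so a bare "main.ipynb" would be ambiguous.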

Output Parameters Description

The following table shows the commonly used output parameters.

| Name | Type | Description |
| --- | --- | --- |
| mlflow_model_file_paths | List | List of model file paths recorded and exported by MLflow's log_model method. |

Python Operator

The Python operator is used to handle Python script tasks and is often used in combination with the Git Directory operator. The parameters of the Python operator are divided into fixed parameters and dynamic parameters. Fixed parameters cannot be deleted, while dynamic parameters can be added, modified, deleted, or sorted based on your needs.

Input Parameters Description

The table below lists the fixed parameters of the Python operator.

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | Directory | Specify the file directory where the code is located, which usually comes from the Git Directory operator. |
| entrypoint | Required | String | Specify the name of the entry program file, which should include the path. |
| requirements_file_path | Optional | String | Specify the file path of the dependent package to be installed. |
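As a rough mental model of how the three fixed parameters relate, the sketch below builds the commands that a runner might execute: an optional dependency installation followed by the entry script. This is an assumption made for illustration. The documentation does not describe how the operator runs scripts, and the helper build_commands and the choice to treat both paths as relative to workspace are hypothetical.

```python
import sys
from pathlib import Path
from typing import List, Optional


def build_commands(
    workspace: str,
    entrypoint: str,
    requirements_file_path: Optional[str] = None,
) -> List[List[str]]:
    """Return the argv lists a runner could execute, in order.

    Assumption: entrypoint and requirements_file_path are relative
    to workspace. The real operator may behave differently.
    """
    commands: List[List[str]] = []
    if requirements_file_path:
        # Install dependencies first, using pip's standard "-r" option.
        req = str(Path(workspace) / requirements_file_path)
        commands.append([sys.executable, "-m", "pip", "install", "-r", req])
    # Then run the entry program file with the same interpreter.
    commands.append([sys.executable, str(Path(workspace) / entrypoint)])
    return commands


commands = build_commands("ws", "src/main.py", "requirements.txt")
```

Each returned list could be passed to subprocess.run(cmd, check=True) in turn. When requirements_file_path is omitted, only the script command is produced.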

Output Parameters Description

The Python operator does not have fixed output parameters. You can add output parameters based on your needs.

Shell Operator

The Shell operator is used to process Shell script tasks. Its input and output parameters are the same as those of the Python operator. See the Python operator descriptions.

Email Operator

The Email operator is used to send email notifications.

Input Parameters Description

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| mail_host | Required | String | IP address or domain name of the email server, for example, smtp.163.com or smtp.office365.com. |
| mail_user | Required | String | Email service user name. |
| mail_pass | Optional | Password | Password corresponding to the user name. |
| sender | Required | String | Sender. |
| receivers | Required | List | Recipient list, which can be derived from the user list in the organization. |
| content | Required | String | Content of the email to be sent. |
| subject | Required | String | Subject of the email to be sent. |
| on_condition | Optional | Run_status | When the specified value is succeed, completed, or failed, this operator can be used as an exit operator. Once the running status of the pipeline matches the specified value, the email is sent. |
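To show how these parameters fit together, here is a minimal Python sketch that uses the standard smtplib and email.message modules. It is not the operator's implementation. The helper names, the STARTTLS connection on port 587, and the simple equality check for on_condition are assumptions made for illustration.

```python
import smtplib
from email.message import EmailMessage
from typing import List, Optional


def build_message(sender: str, receivers: List[str],
                  subject: str, content: str) -> EmailMessage:
    """Assemble an email from the sender, receivers, subject, and content."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(receivers)
    msg["Subject"] = subject
    msg.set_content(content)
    return msg


def should_send(run_status: str, on_condition: Optional[str] = None) -> bool:
    """Assumption: with no on_condition the email is always sent;
    otherwise it is sent only when the pipeline status matches."""
    return on_condition is None or run_status == on_condition


def send(mail_host: str, mail_user: str, mail_pass: Optional[str],
         msg: EmailMessage, port: int = 587) -> None:
    """Assumption: the server accepts STARTTLS on port 587."""
    with smtplib.SMTP(mail_host, port) as server:
        server.starttls()
        if mail_pass:  # mail_pass is optional in the table above
            server.login(mail_user, mail_pass)
        server.send_message(msg)


message = build_message(
    "pipeline@example.com",
    ["alice@example.com", "bob@example.com"],
    "Pipeline run failed",
    "The nightly training pipeline failed.",
)
```

A caller would check should_send(run_status, on_condition) first and call send(...) only when it returns True.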

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| status | String | Email sending status. |
| content_out | String | Email content. |

NotebookEx Operator

The NotebookEx operator is typically used to run ipynb-type tasks that have been verified and saved in Notebook. It gets the developed model code files from the internal storage and runs them. Typical scenarios include running Python code files and training machine learning models. The generated model files are recorded and exported through MLflow's log_model method. For more information about uploading code files to the internal storage, see Uploading Model Code Files to the Internal Storage.


The input and output parameters for the NotebookEx operator can be configured or sorted dynamically based on business needs.

Input Parameters Description

The following table shows the commonly used input parameters.

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | notebook_dir | Specify the file directory. |
| entrypoint | Required | notebook_file | Specify the name of the entry program file, which should include the path (because there may be files with the same name in different directories). |
| requirements | Optional | notebook_file | Specify the dependent package to be installed. |
| env | Optional | List | Specify the list of parameters to be passed. |

Output Parameters Description

The following table shows the commonly used output parameters.

| Name | Type | Description |
| --- | --- | --- |
| mlflow_model_file_paths | List | List of model file paths recorded and exported by MLflow's log_model method. |

PythonEx Operator

The PythonEx operator is used to handle Python script tasks that are saved in the internal storage. The parameters of the PythonEx operator are divided into fixed parameters and dynamic parameters. Fixed parameters cannot be deleted, while dynamic parameters can be added, modified, deleted, or sorted based on your needs.

Input Parameters Description

The following table shows the commonly used input parameters.

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| workspace | Required | notebook_dir | Specify the file directory. |
| entrypoint | Required | notebook_file | Specify the name of the entry program file, which should include the path (because there may be files with the same name in different directories). |
| requirements | Optional | notebook_file | Specify the dependent package to be installed. |

Output Parameters Description

The PythonEx operator does not have fixed output parameters. You can add output parameters based on your needs.

ShellEx Operator

The ShellEx operator is used to process Shell script tasks stored in the internal storage. Its input and output parameters are the same as those of the PythonEx operator. See the PythonEx operator descriptions.

Pipeline Trigger Operator

The Pipeline Trigger operator can only be used as an OnExit operator to trigger another pipeline under the same OU.

Input Parameters Description

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| experiment | Required | pipeline_experiment | Select the pipeline to be triggered. |

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| pipeline_run_id | String | The instance name of the triggered pipeline. |
| pipeline_run_info | String | The running information of the triggered pipeline. |

APIM Operator

The APIM operator is used to call a specified API from APIM.

Input Parameters Description

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| url | Required | String | Enter the address of the API to be called. |
| access_key | Required | String | Enter the AccessKey of the API. |
| secret_key | Required | password | Enter the SecretKey of the API. The SecretKey is hidden when you go to the pipeline design page and view this operator again after you enter or modify the value and save the pipeline. |
| http_method | Required | http_method | Select an HTTP method. Values include GET, POST, PUT, and DELETE. |
| body | Optional | String | Enter the request body. |
| headers | Optional | String | Enter the request header. |
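The sketch below shows how url, http_method, body, and headers could map onto an HTTP request, using Python's standard urllib.request module. It is an illustration under stated assumptions, not the operator's implementation. The helper build_request is hypothetical, and the sketch assumes headers is a JSON object serialized as a string. How access_key and secret_key are used to authenticate the call is not described here, so the sketch leaves them out.

```python
import json
import urllib.request
from typing import Optional

# The four methods listed for the http_method parameter.
ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}


def build_request(url: str, http_method: str,
                  body: Optional[str] = None,
                  headers: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a request from the operator-style inputs."""
    method = http_method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported http_method: {http_method!r}")
    # Assumption: headers arrives as a JSON object string.
    header_map = json.loads(headers) if headers else {}
    data = body.encode("utf-8") if body else None
    return urllib.request.Request(
        url, data=data, headers=header_map, method=method
    )


request = build_request(
    "https://example.com/api/items",  # placeholder URL
    "post",
    body='{"name": "demo"}',
    headers='{"Content-Type": "application/json"}',
)
```

Passing the request to urllib.request.urlopen(request) would perform the call. The response body corresponds to what the operator exposes as its result output.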

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| result | File | Results of the API call. |

ParallelFor Status List Operator

The ParallelFor Status List operator is used to get the results for each item of a ParallelFor operator.

Input Parameters Description

| Name | Required/Optional | Type | Description |
| --- | --- | --- | --- |
| run_id | Required | String | Enter the instance name. You can use instances under this OU. |
| parallelfor_path | Required | String | Enter the path of the ParallelFor operator. |

Output Parameters Description

| Name | Type | Description |
| --- | --- | --- |
| result | List | Result information. |