Partitioner


This stage partitions the records of each batch by the specified conditions and is usually used for joint operations among different points of the same device, or operations among points of different devices. The functions of this stage include:

  • Supporting customized partitioning conditions.

  • Supporting path expressions in customized partitioning conditions. For example, /assetId indicates partitioning by the value of the assetId field in each record.


This stage does not support streaming lineage.

Configuration

The configuration tabs for this stage are General and Partitioner.

General

Name

Required?

Description

Name

Yes

The name of the stage.

Description

No

The description of the stage.

Stage Library

Yes

The streaming operator library to which the stage belongs.

Required Fields

No

The fields that the data records must contain. If the specified fields are not included, the record will be filtered out.

Preconditions

No

The conditions that must be satisfied by the data records. Records that do not meet the conditions will be filtered out. For example, ${record:value('/value') > 0}. For the syntax of EL expressions, see `Expression Language https://docs.streamsets.com/portal/datacollector/latest/help/datacollector/UserGuide/Expression_Language/ExpressionLanguage_overview.html>`__.

On Record Error

Yes

The processing method for error data.

  • Discard: Error data will be discarded and ignored

  • Send to Error: Error messages will be reported

  • Stop Pipeline: The pipeline will be stopped

Partitioner

Name

Required?

Description

Parallelism (Standalone Mode Only)

Yes

Specify the number of partitions to create per batch of records in Standalone mode.

Application Name (Standalone Mode Only)

Yes

The name of the application that runs this stage in Standalone mode (SDC Spark App by default).

Init Method Arguments

Yes

Specify the record fields used for creating partitions. For example, specifying /assetId means assigning records with the same assetId to the same partition.

Output Results

The output results of this stage are the records after partitioning.

Output Example

../../../_images/partitioner_result1.png