Batch Processing Resource¶
Data developers can use the Batch Processing service to process data through scheduled workflows. For more information, see Batch Data Processing Overview.
Resource Application Scenario¶
Two resources are required to run the Batch Processing service: Batch Processing - Queue and Batch Processing - Container.
Batch Processing - Queue¶
When the Hive and Spark interpreters process data for offline analysis tasks, they use the default queue resource for data query and processing. With the default queue, you cannot control the tasks or manage the resources. To run data query and processing tasks that require substantial resources, apply for the Batch Processing - Queue resource and configure the resource name in the notebook.
Resource Specification¶
The Batch Processing - Queue resource is requested in computing units (CUs). If your jobs are CPU-bound, choose the Computing-Intensive specification. If your jobs need more memory per core, choose the Memory-Intensive specification.
Specification | Allocated Resources |
---|---|
Computing-Intensive | 1 CU = 1 Core CPU + 2 GB Memory. Available options are 2 - 5000 CU. |
Memory-Intensive | 1 CU = 1 Core CPU + 4 GB Memory. Available options are 4 - 5000 CU. |
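
As a rough illustration of the arithmetic in the table above, the following Python sketch converts a CU request into CPU cores and memory and checks it against the documented range. The names `QueueSpec`, `SPECS`, and `allocated_resources` are hypothetical helpers made up for this example; they are not part of the Batch Processing service or any SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueSpec:
    """One row of the queue specification table (hypothetical helper type)."""
    name: str
    memory_gb_per_cu: int  # each CU always includes 1 CPU core
    min_cu: int
    max_cu: int = 5000


# Values copied from the specification table; the dictionary keys are
# illustrative only and are not identifiers used by the service.
SPECS = {
    "computing-intensive": QueueSpec("Computing-Intensive", 2, 2),
    "memory-intensive": QueueSpec("Memory-Intensive", 4, 4),
}


def allocated_resources(spec_key: str, cu: int) -> tuple[int, int]:
    """Return (cpu_cores, memory_gb) for a request of `cu` computing units."""
    spec = SPECS[spec_key]
    if not spec.min_cu <= cu <= spec.max_cu:
        raise ValueError(
            f"{spec.name} accepts {spec.min_cu} - {spec.max_cu} CU, got {cu}"
        )
    return cu, cu * spec.memory_gb_per_cu


print(allocated_resources("computing-intensive", 8))  # (8, 16)
print(allocated_resources("memory-intensive", 8))     # (8, 32)
```

For the same number of CUs, both specifications provide the same number of cores, and Memory-Intensive provides twice the memory.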
Batch Processing - Container¶
To run big data analysis tasks using the batch processing service, you need to apply for the Batch Processing - Container resource.
Design Mode: To use batch script development capabilities, request Design Mode resources in advance.
Runtime Mode: To use data synchronization or batch data processing capabilities, request Runtime Mode resources to run manually triggered or periodically scheduled tasks.
Note
You can apply for at most one resource of each resource type.
Resource Specification¶
The Batch Processing - Container resource is calculated in CUs, and different specifications correspond to different processing capabilities. Within the same resource mode, a higher specification provides higher processing efficiency and processes more data per unit time.
Design Mode: Container resources used to execute and debug scripts while you develop batch data processing scripts.
Runtime Mode: Container resources that task nodes require to run scheduled (periodic or immediate) batch data processing or data synchronization tasks.
Design Mode Specification¶
Specification | Description |
---|---|
CU | 1 CU = 1 Core CPU + 2 GB Memory. Available options are 1 - 5000 CU. |
Runtime Mode Specification¶
Specification | Description |
---|---|
CU | 1 CU = 1 Core CPU + 2 GB Memory. Available options are 1 - 5000 CU. |