About EnOS Stream Analytics¶
Powered by Apache Spark™ Streaming with Envision customization and optimization, the EnOS Stream Analytics engine offers high scalability, high throughput, and high fault-tolerance.
Broadly speaking, the generation of data can be considered as a series of discrete events. When drawing these discrete events on a time axis, an event stream or data stream is formed. Stream data consist of these endless event streams. Stream data is generated continuously by a lot of data sources. However, the size of the stream data is normally smaller than that of the offline data. The common sources of stream data can be the devices connected to a data center, the telemetry data of devices, and the log files generated by mobile or web applications.
EnOS Stream Analytics can be used in the following scenarios:
Aggregating and calculating asset raw data
In most business scenarios, you might need to filter the raw data received from devices, aggregate the data by a certain algorithm, and save the aggregated data for further processing.
Computation of device states
In some business scenarios, you might need to obtain certain state parameters of a device to confirm its status. The stream analytics framework maintains the state of devices and sites internally. The system updates both the measuring point values and the device connection status to the latest state.
Features¶
Real-time and unbounded stream data processing
The computation the engine processes is real-time and streaming, and the data streams are subscribed and consumed by stream computing in chronological order. Because the data is generated continuously, the data streams are integrated to the streaming system continuously. For example, website access log is a type of stream data, the log continuously records data as long as the website is on line. Thus, the stream data is always real-time and unbounded.
Continuous and efficient computation
The computation models of stream computing are “event triggered”. The trigger is the unbounded stream data mentioned in the previous section. Once new stream data is sent to the system, the system immediately initiates and performs a computation task. Therefore, stream computing is a continuous process.
Stream and real-time data integration
The result of stream computing triggered by stream data is recorded continuously into the destination data storage, based on the configured data storage policies.
Data Processing Flow¶
The procedure of EnOS Stream Analytics is as follows:
Processing of raw data
Original measuring point data is sent to Kafka through the EnOS connection layer. The messages received are analyzed by the stream computing process of measuring points. Before processing, the data is filtered by the specified threshold. Data exceeding the threshold will be processed by interpolation algorithm.
Performing calculation
In this step, data is computed by the defined algorithm in the processing strategy.
Output of computation
The data from the streaming module flows into In-memory Database (IMDB) and Kafka, and the downstream continues to subscribe all data from Kafka and record them to time series database (TSDB) or other storage systems. The stored data can be used as the data source for offline data processing and further processing and analysis.
Data Processing Templates¶
EnOS Stream Analytics provides the following templates for developers to configure stream data processing jobs quickly:
- Time Window Aggregation
- Multi-Point Merging
- Electric Energy Cal by Meter Reading
- Electric Energy Cal by Instant Power
- Electric Energy Cal by Average Power
StreamSets Operators¶
EnOS Stream Analytics provides a set of underlying packaged StreamSets operators for developers to design customized stream data processing jobs to meet the requirements of complex business scenarios. For more information, see StreamSets Operator Reference.