Saving State of Data Pipelines
While a data pipeline runs, it saves and uses the state of its computation. This state is crucial for ensuring the accuracy and integrity of the data processing operations. Jet snapshots allow you to save and restore this processing state.
A snapshot captures the state of a running Jet job at a particular point in time. It allows you to take a consistent record of the in-flight computations and processed data, which you can use for various purposes, such as fault tolerance, job migration, or analysis.
When the Jet engine takes a snapshot, all data in transit and the internal state of the members processing the job are recorded. This means that if the job fails or is restarted, it can be restored to the state it was in when the snapshot was taken. This helps to achieve fault-tolerant processing and ensure data integrity.
To export a snapshot in Hazelcast Platform Operator, use the JetJobSnapshot custom resource.
Configuring the JetJobSnapshot Resource
Configuration options for the JetJobSnapshot custom resource.
Field | Description |
---|---|
name | Name of the exported snapshot. If empty, the name of the custom resource is used. You cannot modify this value after the object is created. |
cancelJob | Determines whether the job is canceled after exporting a snapshot. The default value is false. |
jetJobResourceName | Name of the JetJob custom resource from which the snapshot is exported. |
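As a minimal sketch of the defaults described above, the following manifest omits both optional fields. Assuming the defaults behave as documented, the exported snapshot takes the name of the custom resource and the job keeps running after the export. The resource name `nightly-snapshot` is hypothetical, and the manifest assumes that a JetJob resource named `jet-job-sample` already exists.

```yaml
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  # Hypothetical resource name. With spec.name omitted, it is also used as the snapshot name.
  name: nightly-snapshot
spec:
  # cancelJob is omitted, so it defaults to false and the job keeps running.
  # Assumes a JetJob resource with this name exists.
  jetJobResourceName: jet-job-sample
```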
Exporting a Snapshot
Use the following example configuration to export a snapshot using the JetJobSnapshot custom resource.
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  name: jetjobsnapshot-sample
spec:
  name: snapshot-example # (1)
  cancelJob: false
  jetJobResourceName: jet-job-sample # (2)
1 | Sets the name of the exported snapshot. |
2 | Specifies the name of the JetJob CR object from which the snapshot is exported. |
You can only export a snapshot from a Jet job that has a status of Running.
Starting a Jet Job Initialized from the Snapshot
Use the following example to start a new Jet job that is initialized from a snapshot.
apiVersion: hazelcast.com/v1alpha1
kind: JetJob
metadata:
  name: jet-job-sample
spec:
  name: my-test-jet-job
  hazelcastResourceName: hazelcast
  state: Running
  initialSnapshotResourceName: jetjobsnapshot-sample # (1)
  jarName: jet-pipeline-1.0.2.jar
  bucketConfig:
    bucketURI: "gs://operator-user-code/jetJobs"
    secretName: br-secret-gcp
1 | Specifies the name of the JetJobSnapshot CR object from which the Jet job is initialized. |
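The two resources can be combined to stop a job and later resume it from the exported state, for example when migrating a job. The following is a sketch under assumptions, not a verified procedure: the resource names `jet-job-handover` and `jet-job-resumed` and the job name `my-test-jet-job-resumed` are hypothetical, and the remaining values are copied from the examples on this page. Apply the second manifest only after the snapshot export has finished.

```yaml
# Step 1: export a snapshot and cancel the source job afterwards.
apiVersion: hazelcast.com/v1alpha1
kind: JetJobSnapshot
metadata:
  name: jet-job-handover            # hypothetical resource name
spec:
  cancelJob: true                   # cancel the job once the snapshot is exported
  jetJobResourceName: jet-job-sample
---
# Step 2: start a new job that is initialized from the exported snapshot.
apiVersion: hazelcast.com/v1alpha1
kind: JetJob
metadata:
  name: jet-job-resumed             # hypothetical resource name
spec:
  name: my-test-jet-job-resumed     # hypothetical job name
  hazelcastResourceName: hazelcast
  state: Running
  initialSnapshotResourceName: jet-job-handover  # refers to the JetJobSnapshot above
  jarName: jet-pipeline-1.0.2.jar
  bucketConfig:
    bucketURI: "gs://operator-user-code/jetJobs"
    secretName: br-secret-gcp
```

Setting `cancelJob: true` in the first step means the source job stops once the snapshot has been exported, so the resumed job does not run alongside it.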