:: Experimental ::
Abstract class for getting and updating the state in the mapping function used in the mapWithState operation of a pair DStream (Scala) or a JavaPairDStream (Java).
Scala example of using State:
// A mapping function that maintains an integer state and returns a String
def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
  // Check if state exists
  if (state.exists) {
    val existingState = state.get  // Get the existing state
    val shouldRemove = ...         // Decide whether to remove the state
    if (shouldRemove) {
      state.remove()               // Remove the state
    } else {
      val newState = ...
      state.update(newState)       // Set the new state
    }
  } else {
    val initialState = ...
    state.update(initialState)     // Set the initial state
  }
  ... // return something
}
Java example of using State:
// A mapping function that maintains an integer state and returns a String
Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
    new Function3<String, Optional<Integer>, State<Integer>, String>() {
      @Override
      public String call(String key, Optional<Integer> value, State<Integer> state) {
        if (state.exists()) {
          int existingState = state.get();  // Get the existing state
          boolean shouldRemove = ...;       // Decide whether to remove the state
          if (shouldRemove) {
            state.remove();                 // Remove the state
          } else {
            int newState = ...;
            state.update(newState);         // Set the new state
          }
        } else {
          int initialState = ...;           // Set the initial state
          state.update(initialState);
        }
        // return something
      }
    };
Class of the state
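The placeholders in the examples above can be made concrete with a running sum per key. The following is a minimal sketch in plain Java that does not depend on Spark: SimpleState is a hypothetical stand-in that only mimics the four operations named above (exists, get, update, remove), and RESET_AT is an illustrative threshold chosen for this sketch, not part of the Spark API.

```java
import java.util.NoSuchElementException;
import java.util.Optional;

// Hypothetical stand-in for the State class; it only mimics
// exists(), get(), update() and remove() for illustration.
class SimpleState<S> {
    private S value; // null means "no state"

    boolean exists() { return value != null; }

    S get() {
        if (value == null) throw new NoSuchElementException("State is not set");
        return value;
    }

    void update(S newValue) { value = newValue; }

    void remove() { value = null; }
}

class RunningSumSketch {
    // Illustrative threshold (an assumption): drop the state once the sum reaches it.
    static final int RESET_AT = 10;

    // Same shape as the mapping functions above: (key, optional value, state) -> String.
    static String mappingFunction(String key, Optional<Integer> value, SimpleState<Integer> state) {
        int increment = value.orElse(0);
        if (state.exists()) {
            int newState = state.get() + increment; // Get the existing state and add to it
            if (newState >= RESET_AT) {
                state.remove();                     // Remove the state
            } else {
                state.update(newState);             // Set the new state
            }
            return key + "=" + newState;
        } else {
            state.update(increment);                // Set the initial state
            return key + "=" + increment;
        }
    }

    public static void main(String[] args) {
        SimpleState<Integer> state = new SimpleState<>();
        System.out.println(mappingFunction("a", Optional.of(3), state)); // a=3
        System.out.println(mappingFunction("a", Optional.of(4), state)); // a=7
        System.out.println(mappingFunction("a", Optional.of(5), state)); // a=12
        System.out.println(state.exists());                              // false
    }
}
```

In a real streaming job the framework supplies the state object for each key, so the mapping function never constructs it itself; the sketch creates one by hand only to show how the branches behave across successive calls.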
:: Experimental ::
Abstract class representing all the specifications of the mapWithState transformation of a pair DStream (Scala) or a JavaPairDStream (Java).
Use the org.apache.spark.streaming.StateSpec.function() factory methods to create instances of this class.
Example in Scala:
// A mapping function that maintains an integer state and returns a String
def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
  // Use state.exists(), state.get(), state.update() and state.remove()
  // to manage state, and return the necessary string
}

val spec = StateSpec.function(mappingFunction).numPartitions(10)

val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)
Example in Java:
// A mapping function that maintains an integer state and returns a String
Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
    new Function3<String, Optional<Integer>, State<Integer>, String>() {
      @Override
      public String call(String key, Optional<Integer> value, State<Integer> state) {
        // Use state.exists(), state.get(), state.update() and state.remove()
        // to manage state, and return the necessary string
      }
    };

JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream =
    keyValueDStream.mapWithState(StateSpec.function(mappingFunction));
Class of the state key
Class of the state value
Class of the state data
Class of the mapped elements
Main entry point for Spark Streaming functionality. It provides methods used to create org.apache.spark.streaming.dstream.DStreams from various input sources. It can be created by providing a Spark master URL and an appName, from an org.apache.spark.SparkConf configuration (see core Spark documentation), or from an existing org.apache.spark.SparkContext. The associated SparkContext can be accessed using context.sparkContext. After creating and transforming DStreams, the streaming computation can be started and stopped using context.start() and context.stop(), respectively. context.awaitTermination() allows the current thread to wait for the termination of the context by stop() or by an exception.
This is a simple class that represents an absolute instant of time. Internally, it represents time as the difference, measured in milliseconds, between that instant and midnight, January 1, 1970 UTC. This is the same format as the value returned by System.currentTimeMillis.
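To make the millisecond representation concrete, the sketch below uses only the JDK (java.time.Instant and System.currentTimeMillis), not the Spark class itself. The class name EpochMillisSketch and the helper describe are hypothetical names introduced for this illustration.

```java
import java.time.Instant;

class EpochMillisSketch {
    // Render an epoch-millisecond value as an ISO-8601 UTC timestamp.
    static String describe(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // Zero milliseconds is the epoch itself.
        System.out.println(describe(0L));      // 1970-01-01T00:00:00Z
        // 60,000 ms after the epoch is one minute past midnight.
        System.out.println(describe(60_000L)); // 1970-01-01T00:01:00Z
        // The current wall-clock time uses the same representation.
        long now = System.currentTimeMillis();
        System.out.println(now + " ms -> " + describe(now));
    }
}
```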
Helper object that creates an instance of org.apache.spark.streaming.Duration representing a given number of milliseconds.
Helper object that creates an instance of org.apache.spark.streaming.Duration representing a given number of minutes.
Helper object that creates an instance of org.apache.spark.streaming.Duration representing a given number of seconds.
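The three helpers differ only in the unit of their argument. As a rough illustration of the underlying arithmetic, the sketch below converts seconds and minutes to milliseconds with the JDK's java.util.concurrent.TimeUnit. It does not use the Spark classes, and DurationMillisSketch and its methods are hypothetical names introduced here.

```java
import java.util.concurrent.TimeUnit;

class DurationMillisSketch {
    // A length given in seconds, expressed in milliseconds.
    static long secondsToMillis(long seconds) {
        return TimeUnit.SECONDS.toMillis(seconds);
    }

    // A length given in minutes, expressed in milliseconds.
    static long minutesToMillis(long minutes) {
        return TimeUnit.MINUTES.toMillis(minutes);
    }

    public static void main(String[] args) {
        System.out.println(secondsToMillis(2)); // 2000
        System.out.println(minutesToMillis(5)); // 300000
    }
}
```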
:: Experimental ::
Builder object for creating instances of org.apache.spark.streaming.StateSpec, which is used for specifying the parameters of the mapWithState transformation of a pair DStream (Scala) or a JavaPairDStream (Java).
Example in Scala:
// A mapping function that maintains an integer state and returns a String
def mappingFunction(key: String, value: Option[Int], state: State[Int]): Option[String] = {
  // Use state.exists(), state.get(), state.update() and state.remove()
  // to manage state, and return the necessary string
}

val spec = StateSpec.function(mappingFunction).numPartitions(10)

val mapWithStateDStream = keyValueDStream.mapWithState[StateType, MappedType](spec)
Example in Java:
// A mapping function that maintains an integer state and returns a String
Function3<String, Optional<Integer>, State<Integer>, String> mappingFunction =
    new Function3<String, Optional<Integer>, State<Integer>, String>() {
      @Override
      public String call(String key, Optional<Integer> value, State<Integer> state) {
        // Use state.exists(), state.get(), state.update() and state.remove()
        // to manage state, and return the necessary string
      }
    };

JavaMapWithStateDStream<String, Integer, Integer, String> mapWithStateDStream =
    keyValueDStream.mapWithState(StateSpec.function(mappingFunction));
StreamingContext object contains a number of utility functions related to the StreamingContext class.
Various implementations of DStreams.
Spark Streaming functionality. org.apache.spark.streaming.StreamingContext serves as the main entry point to Spark Streaming, while org.apache.spark.streaming.dstream.DStream is the data type for a continuous sequence of RDDs, which together represent a continuous stream of data.
In addition, org.apache.spark.streaming.dstream.PairDStreamFunctions contains operations available only on DStreams of key-value pairs, such as groupByKey and reduceByKey. These operations are automatically available on any DStream of the right type (e.g. DStream[(Int, Int)]) through implicit conversions.
For the Java API of Spark Streaming, take a look at org.apache.spark.streaming.api.java.JavaStreamingContext, which serves as the entry point, and at org.apache.spark.streaming.api.java.JavaDStream and org.apache.spark.streaming.api.java.JavaPairDStream, which provide the DStream functionality.