pyspark.RDD.treeAggregate

RDD.treeAggregate(zeroValue: U, seqOp: Callable[[U, T], U], combOp: Callable[[U, U], U], depth: int = 2) → U

Aggregates the elements of this RDD in a multi-level tree pattern.
New in version 1.3.0.
- Parameters
- zeroValue : U
  the initial value for the accumulated result of each partition
- seqOp : function
  a function used to accumulate results within a partition
- combOp : function
  an associative function used to combine results from different partitions
- depth : int, optional, default 2
  suggested depth of the tree
- Returns
- U
  the aggregated result
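To make the two-stage semantics concrete, here is a minimal pure-Python sketch (the function name `tree_aggregate` and the list-of-lists partition layout are illustrative, not Spark API): each partition is folded with seqOp starting from zeroValue, and the per-partition results are then merged pairwise with combOp level by level, rather than all at once on a single node. Spark's `depth` parameter controls how many merge levels are used; this sketch simply merges pairwise until one value remains.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")

def tree_aggregate(partitions: List[List[T]],
                   zero: U,
                   seq_op: Callable[[U, T], U],
                   comb_op: Callable[[U, U], U]) -> U:
    # Stage 1: fold each partition locally with seq_op, starting from zero.
    partials = []
    for part in partitions:
        acc = zero
        for x in part:
            acc = seq_op(acc, x)
        partials.append(acc)
    # Stage 2: merge partial results pairwise with comb_op, level by level,
    # mimicking the tree pattern (Spark picks the fan-in from `depth`).
    while len(partials) > 1:
        merged = []
        for i in range(0, len(partials), 2):
            pair = partials[i:i + 2]
            merged.append(pair[0] if len(pair) == 1 else comb_op(pair[0], pair[1]))
        partials = merged
    return partials[0] if partials else zero

# Example where U differs from T: aggregate (sum, count) pairs from ints.
parts = [[-5, -4, -3], [-2, -1], [1, 2, 3, 4]]
result = tree_aggregate(parts, (0, 0),
                        lambda acc, x: (acc[0] + x, acc[1] + 1),
                        lambda a, b: (a[0] + b[0], a[1] + b[1]))
# result is (-5, 9): the sum and count of all nine elements.
```

Note that combOp must be associative for the result to be independent of how the tree groups the partial results, which is why the doctests below return the same value for every depth.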
Examples
>>> add = lambda x, y: x + y
>>> rdd = sc.parallelize([-5, -4, -3, -2, -1, 1, 2, 3, 4], 10)
>>> rdd.treeAggregate(0, add, add)
-5
>>> rdd.treeAggregate(0, add, add, 1)
-5
>>> rdd.treeAggregate(0, add, add, 2)
-5
>>> rdd.treeAggregate(0, add, add, 5)
-5
>>> rdd.treeAggregate(0, add, add, 10)
-5