Explanation of the Scala aggregate function
I do not understand the aggregate function yet.
For example, given:

    val x = List(1, 2, 3, 4, 5, 6)
    val y = x.par.aggregate((0, 0))((x, y) => (x._1 + y, x._2 + 1), (x, y) => (x._1 + y._1, x._2 + y._2))

the result will be: (21,6)

Well, I think that (x, y) => (x._1 + y._1, x._2 + y._2) combines the results in parallel, for example (1 + 2, 1 + 1), and so on.
But this is the part that leaves me confused:

    (x, y) => (x._1 + y, x._2 + 1)

Why x._1 + y? And is x._2 equal to 0 here?
Thanks in advance.
From the documentation:

    def aggregate[B](z: ⇒ B)(seqop: (B, A) ⇒ B, combop: (B, B) ⇒ B): B

Aggregates the results of applying an operator to subsequent elements.

This is a more general form of fold and reduce. It has similar semantics, but does not require the result to be a supertype of the element type. It traverses the elements in different partitions sequentially, using seqop to update the result, and then applies combop to results from different partitions. The implementation of this operation may operate on an arbitrary number of collection partitions, so combop may be invoked an arbitrary number of times.

For example, one might want to process some elements and then produce a Set. In this case, seqop would process an element and append it to the list, while combop would concatenate two lists from different partitions together. The initial value z would be an empty set.

    pc.aggregate(Set[Int]())(_ += process(_), _ ++ _)

Another example is calculating the geometric mean from a collection of doubles (one would typically require big doubles for this).

B: the type of accumulated results
z: the initial value for the accumulated result of the partition. This will typically be the neutral element for the seqop operator (e.g. Nil for list concatenation or 0 for summation) and may be evaluated more than once
seqop: an operator used to accumulate results within a partition
combop: an associative operator used to combine results from different partitions
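To make the documentation's Set example concrete, here is a minimal sketch that imitates what aggregate does, using only foldLeft and reduce on an ordinary List. Note the assumptions: process is a hypothetical stand-in (the documentation does not define it), the split into two partitions of four elements is done by hand, and an immutable Set with + replaces the += from the documentation snippet. A real parallel collection chooses its own partitions.

```scala
object AggregateSetSketch {
  // Hypothetical stand-in for the documentation's `process`; any Int => Int would do.
  def process(n: Int): Int = n % 3

  val elems: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8)

  // Assumption: two hand-made partitions of four elements each.
  val partitions: List[List[Int]] = elems.grouped(4).toList

  // seqop: walk each partition sequentially, starting from a fresh empty set (z).
  val partials: List[Set[Int]] =
    partitions.map(_.foldLeft(Set.empty[Int])((acc, n) => acc + process(n)))

  // combop: merge the per-partition sets into one result.
  val combined: Set[Int] = partials.reduce(_ ++ _)

  def main(args: Array[String]): Unit =
    println(combined)
}
```

Each partition starts from its own copy of z, which is why the documentation says z may be evaluated more than once.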
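In your example B is a Tuple2[Int, Int]. The method seqop then takes a single element from the list, scoped as y, and updates the aggregate B to (x._1 + y, x._2 + 1). So it adds y to the first element of the tuple and increments the second element. This effectively puts the sum of the elements into the first element of the tuple and the number of elements into the second element of the tuple. At the first step x is the initial value (0, 0), so yes, x._2 starts at 0.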
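If it helps to see the accumulator change step by step, the following sketch runs the same seqop over the list sequentially with scanLeft, which keeps every intermediate (sum, count) tuple. This only illustrates what happens inside a single partition; the object name and the parameter names acc and elem are mine, not part of the question.

```scala
object SeqopTrace {
  val xs: List[Int] = List(1, 2, 3, 4, 5, 6)

  // Same seqop as in the question: acc is the (sum, count) tuple, elem is one list element.
  val seqop: ((Int, Int), Int) => (Int, Int) =
    (acc, elem) => (acc._1 + elem, acc._2 + 1)

  // scanLeft records every intermediate accumulator, starting from z = (0, 0).
  val steps: List[(Int, Int)] = xs.scanLeft((0, 0))(seqop)

  def main(args: Array[String]): Unit =
    steps.foreach(println)
  // prints (0,0), (1,1), (3,2), (6,3), (10,4), (15,5), (21,6), one per line
}
```

The first tuple is the initial value z, and the last one matches the (21,6) from the question.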
The method combop then takes the results from each parallel execution thread and combines them. Combining by addition gives the same result as if the operation had been run on the list sequentially.
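As a rough illustration of the combine step, the sketch below splits the list into two halves by hand, folds each half with seqop, and then merges the two partial tuples with combop. The split at index 3 is an assumption made for the example; the real parallel implementation decides how many partitions to use and where to cut.

```scala
object CombopSketch {
  val xs: List[Int] = List(1, 2, 3, 4, 5, 6)

  val seqop: ((Int, Int), Int) => (Int, Int) =
    (acc, elem) => (acc._1 + elem, acc._2 + 1)
  val combop: ((Int, Int), (Int, Int)) => (Int, Int) =
    (a, b) => (a._1 + b._1, a._2 + b._2)

  // Assumed split into two partitions: List(1, 2, 3) and List(4, 5, 6).
  val (left, right) = xs.splitAt(3)

  // Each partition is folded on its own, starting from z = (0, 0).
  val leftResult: (Int, Int) = left.foldLeft((0, 0))(seqop)   // (6, 3)
  val rightResult: (Int, Int) = right.foldLeft((0, 0))(seqop) // (15, 3)

  // combop merges the partial results; a plain sequential fold gives the same tuple.
  val merged: (Int, Int) = combop(leftResult, rightResult)
  val sequential: (Int, Int) = xs.foldLeft((0, 0))(seqop)

  def main(args: Array[String]): Unit =
    println(s"merged = $merged, sequential = $sequential")
}
```

Because tuple-wise addition is associative and (0, 0) is its neutral element, the result does not depend on where the list is cut.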
Using a tuple for B is likely the confusing piece of this. You can break the problem down into two sub-problems to get a better idea of what it is doing. res0 is the first element of the result tuple, and res1 is the second element of the result tuple.
    // Sums all elements in parallel.
    scala> x.par.aggregate(0)((x, y) => x + y, (x, y) => x + y)
    res0: Int = 21

    // Counts all elements in parallel.
    scala> x.par.aggregate(0)((x, y) => x + 1, (x, y) => x + y)
    res1: Int = 6