Pipe groupByKey data in Apache Spark
I want to transform the following data set with an external script piped from Apache Spark:
key,val1,val2
1,a,b
1,c,d
1,e,f
2,g,h
2,i,j
2,k,l
The data should first be grouped by key, and then the values for each key should be passed to the external script using pipe().
I tried the following code, but it calls the script only once and passes all of the data to it:
data.map(s => s.split(",")).map(a => (a(1), a)).groupByKey().pipe(Seq(SparkFiles.get("test.sh")))
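For context, RDD.pipe() starts the external process once per partition and writes each element of that partition to the process's stdin as one line, so it does not start a separate process per key, which may be why the script ran only once here. One possible way to get one invocation per key is to launch the process yourself inside the grouped RDD with scala.sys.process. The following is only a minimal sketch under stated assumptions: `cat` is a placeholder for the real script (on a cluster that would be something like SparkFiles.get("test.sh") after sc.addFile), the object name PipePerKey and the local master are illustrative, and the key is taken from column 0 of the sample data.

```scala
import java.io.ByteArrayInputStream
import java.nio.charset.StandardCharsets

import org.apache.spark.{SparkConf, SparkContext}
import scala.sys.process._

// Hypothetical driver object; the name and the local master are illustrative only.
object PipePerKey {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("pipe-per-key").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Sample rows from the question, without the header line.
    val data = sc.parallelize(Seq("1,a,b", "1,c,d", "1,e,f", "2,g,h", "2,i,j", "2,k,l"))

    val perKeyOutput = data
      .map(_.split(","))
      .map(a => (a(0), a.mkString(",")))   // a(0) is the key column in the sample data
      .groupByKey()
      .flatMap { case (key, lines) =>
        // Feed this key's rows to the process on stdin, one row per line.
        val input = lines.mkString("", "\n", "\n").getBytes(StandardCharsets.UTF_8)
        // Assumption: `cat` stands in for the real script and simply echoes its input.
        val output = (Seq("cat") #< new ByteArrayInputStream(input)).!!
        // Tag every output line with the key it belongs to.
        output.split("\n").filter(_.nonEmpty).map(line => (key, line)).toSeq
      }

    perKeyOutput.collect().foreach(println)
    sc.stop()
  }
}
```

This starts one process per key, which can be expensive when there are many keys, and groupByKey holds all values of a key in memory at once. If the script can tell from its input where one key ends and the next begins, sorting by key and using a single pipe() per partition may be cheaper.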
Tags: apache, pipe, apache-spark