scala - How to flatMap a function on GroupedDataSet in Apache Flink -
i want apply function via flatmap
each group produced dataset.groupby
. trying call flatmap
compiler error:
error: value flatmap not member of org.apache.flink.api.scala.groupeddataset
my code:
var mapped = env.fromcollection(array[(int, int)]()) var groups = mapped.groupby("mygroupfield") groups.flatmap( myfunction: (int, array[int]) => array[(int, array[(int, int)])] ) // error: groupeddataset has no member flatmap
indeed, in documentation of flink-scala 0.9-snapshot no map
or similar listed. there similar method work with? how achieve desired distributed mapping on each group individually on node?
you can use reducegroup(groupreducefunction f)
process elements group. groupreducefunction
gives iterable
on elements of group , collector
emit arbitrary number of elements.
flink's groupby()
function not group multiple elements single element, i.e., not convert group of (int, int)
elements (that share same _1
tuple field) 1 (int, array[int])
. instead, dataset[(int, int)]
logically grouped such elements have same key can processed together. when apply groupreducefunction
on groupeddataset
, function called once each group. in each call elements of group handed function. function can process elements of group , convert group of (int, int)
elements single (int, array[int])
element.
Comments
Post a Comment