scala - How to find max value in pair RDD?
I have a Spark pair RDD of (key, count) as below:
Array[(String, Int)] = Array((a,1), (b,2), (c,1), (d,3))
How can I find the key with the highest count using the Spark Scala API?
Edit: the datatype of the pair RDD is org.apache.spark.rdd.RDD[(String, Int)].
Use the Array.maxBy method:
val a = Array(("a",1), ("b",2), ("c",1), ("d",3))
val maxKey = a.maxBy(_._2) // maxKey: (String, Int) = (d,3)
Or RDD.max with an explicit Ordering that compares the counts:
val maxKey2 = rdd.max()(new Ordering[Tuple2[String, Int]]() {
  override def compare(x: (String, Int), y: (String, Int)): Int =
    Ordering[Int].compare(x._2, y._2)
})
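The anonymous Ordering above can likely be written more compactly with Ordering.by. The following is a minimal sketch, not the original answerer's code: it runs on a plain Scala Array so that it works without a Spark context, and the names pairs, byCount, maxPair and maxByReduce are illustrative assumptions.

```scala
// Sample data mirroring the question (illustrative, held in a local Array).
val pairs: Array[(String, Int)] = Array(("a", 1), ("b", 2), ("c", 1), ("d", 3))

// Ordering that compares pairs by their count (the second element) only.
val byCount: Ordering[(String, Int)] = Ordering.by[(String, Int), Int](_._2)

// max accepts the ordering as an explicit argument.
val maxPair: (String, Int) = pairs.max(byCount)

// Alternative: a pairwise reduce that keeps whichever pair has the larger count.
val maxByReduce: (String, Int) = pairs.reduce((x, y) => if (x._2 >= y._2) x else y)

// The key with the highest count.
val maxKey: String = maxPair._1
```

Since RDD.max takes its Ordering as an implicit parameter, the same byCount value should be usable as rdd.max()(byCount), and the reduce function should carry over to rdd.reduce unchanged. If several keys tie for the highest count, which one is returned is not something to rely on.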