I have a set U of key-value pairs, stored in an RDD named rdd.
What is the recommended way to merge an arbitrary RDD rdd_not_set into rdd such that the resulting RDD is also a set?
rdd = sc.union([rdd, rdd_not_set])
rdd = rdd.reduceByKey(reduce_func)
Example: given rdd = sc.parallelize([(1,2), (2,3)]) and rdd_not_set = sc.parallelize([(1,4), (3,4)]), the result should be final_rdd = sc.parallelize([(1,4), (2,3), (3,4)])
The naive solution is to perform a union followed by reduceByKey, but that would be very inefficient because rdd will be huge.
