Say I have three RDD transformation functions called in sequence, starting from rdd1:
val rdd2 = rdd1.f1
val rdd3 = rdd2.f2
val rdd4 = rdd3.f3
Now I want to cache rdd4, so I call rdd4.cache().
My question:
Will only rdd4 be cached once an action runs on it, or will every RDD above rdd4 in the lineage (rdd2 and rdd3) be cached as well? And if I want to cache both rdd3 and rdd4, do I need to call cache() on each of them separately?
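One way I could check this empirically is to inspect each RDD's storage level after running an action. Below is a minimal sketch, not a definitive test. It assumes a local Spark setup. The `map`/`filter` calls are only stand-ins for my f1, f2 and f3, and the object name `CacheCheck` is made up for illustration. As far as I understand, `cache()` only marks the RDD it is called on, so I would expect only rdd4 to report a storage level other than `NONE`.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

// Hypothetical driver object, only for illustration.
object CacheCheck {
  def main(args: Array[String]): Unit = {
    // Assumption: run locally; adjust the master for a real cluster.
    val conf = new SparkConf().setAppName("cache-check").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      val rdd1 = sc.parallelize(1 to 10)
      val rdd2 = rdd1.map(_ + 1)         // stand-in for f1
      val rdd3 = rdd2.filter(_ % 2 == 0) // stand-in for f2
      val rdd4 = rdd3.map(_ * 10)        // stand-in for f3

      rdd4.cache()  // marks rdd4 for persistence; nothing is computed yet
      rdd4.count()  // an action triggers the computation

      // Print the storage level of every RDD in the chain.
      Seq("rdd1" -> rdd1, "rdd2" -> rdd2, "rdd3" -> rdd3, "rdd4" -> rdd4).foreach {
        case (name, rdd) =>
          val marked = rdd.getStorageLevel != StorageLevel.NONE
          println(s"$name: level=${rdd.getStorageLevel}, marked for caching=$marked")
      }

      // IDs of all RDDs the context currently tracks as persistent.
      println(s"persistent RDD ids: ${sc.getPersistentRDDs.keys.mkString(", ")}")
    } finally {
      sc.stop()
    }
  }
}
```

If that expectation holds, caching rdd3 as well would need its own `rdd3.cache()` call before the action runs.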