I am using Spark SQL for a data migration project. How should I implement a staging area in Spark? When should I use Spark SQL cache versus persist? Are there any real-world use cases?
~Sha
As with RDDs (see "What is the difference between cache and persist?"), the only difference between cache and persist is that persist lets you set a non-default storage level.
There is one important difference, though. Unlike the RDD API, where cache uses MEMORY_ONLY, the Dataset counterpart uses MEMORY_AND_DISK.
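To make the difference concrete, here is a minimal sketch in Scala that reads back the storage level Spark assigns in each case. The object name `CacheVsPersist`, the helper `storageLevels`, and the local master setting are assumptions made for illustration; only `cache`, `persist`, `getStorageLevel`, and `storageLevel` come from the Spark API.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

// Hypothetical helper object, not part of Spark.
object CacheVsPersist {

  // Returns the storage levels assigned by RDD.cache(), Dataset.cache(),
  // and Dataset.persist(level) with an explicit level.
  def storageLevels(spark: SparkSession): (StorageLevel, StorageLevel, StorageLevel) = {
    // RDD API: cache() is shorthand for persist(StorageLevel.MEMORY_ONLY).
    val rdd = spark.sparkContext.parallelize(1 to 100).cache()

    // Dataset API: cache() uses the Dataset default, MEMORY_AND_DISK.
    val cached = spark.range(100).cache()

    // persist(level) is the only way to pick a non-default level.
    // A different range is used so this query plan is not already cached.
    val persisted = spark.range(200).persist(StorageLevel.DISK_ONLY)

    val levels = (rdd.getStorageLevel, cached.storageLevel, persisted.storageLevel)

    // Release the cache entries once the levels have been read.
    rdd.unpersist()
    cached.unpersist()
    persisted.unpersist()
    levels
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only; a real job would get its master from spark-submit.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("cache-vs-persist")
      .getOrCreate()
    try {
      val (rddLevel, cacheLevel, persistLevel) = storageLevels(spark)
      println(s"RDD.cache():         ${rddLevel.description}")
      println(s"Dataset.cache():     ${cacheLevel.description}")
      println(s"Dataset.persist(..): ${persistLevel.description}")
    } finally {
      spark.stop()
    }
  }
}
```

Note that caching is lazy in both APIs: the data is only materialized when the first action runs, so caching tends to pay off when the same Dataset feeds more than one action.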