I would like to replicate the Pandas nunique function with Spark SQL and the DataFrame API. I have the following:
%spark
import org.apache.spark.sql.functions._
val df = spark.read
        .format("csv")
        .option("delimiter", ";")
        .option("header", "true") //first line in file has headers
        .load("target/youtube_videos.csv")
        
println("Distinct Count: " + df.distinct().count())
        
val df2 = df.select(countDistinct("likes"))
df2.show(false)
This works and prints the distinct count for the likes column, as shown below:
Distinct Count: 109847
+---------------------+
|count(DISTINCT likes)|
+---------------------+
|27494                |
+---------------------+
How can I do this in a single SQL query so that I get a distinct-count summary of all the individual columns at once?
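To make the question concrete, here is a rough sketch of the kind of one-shot summary I am after, built by mapping countDistinct over df.columns; the temp view name videos is just a placeholder I made up, and I have not verified this is the idiomatic way:

%spark
// DataFrame API sketch: apply countDistinct to every column in one select.
val distinctCounts = df.select(df.columns.map(c => countDistinct(col(c)).alias(c)): _*)
distinctCounts.show(false)

// SQL variant: register a temp view ("videos" is an assumed name) and
// generate one SELECT with a COUNT(DISTINCT col) per column.
df.createOrReplaceTempView("videos")
val query = "SELECT " + df.columns.map(c => s"COUNT(DISTINCT `$c`) AS `$c`").mkString(", ") + " FROM videos"
spark.sql(query).show(false)

Is something like this the recommended approach, or is there a cleaner way to express it as one query?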