I have the following RDD and many just like it:
val csv = sc.parallelize(Array(
  "col1, col2, col3",
  "1, cat, dog",
  "2, bird, bee"))
I would like to convert the RDD into a DataFrame where the schema is created dynamically/programmatically based on the first row of the RDD.
I need to apply the same logic to multiple similar RDDs, so I cannot specify the schema explicitly with a case class, nor can I use spark-csv to load the data in as a DataFrame from the start.
I've created a flattened DataFrame, but I'm wondering how to break out the respective columns when creating the DataFrame?
Current code:
val header = csv.first()
val data = csv.mapPartitionsWithIndex {
  (idx, iter) => if (idx == 0) iter.drop(1) else iter
}
data.toDF(header).show()
Current output:
+----------------+
|col1, col2, col3|
+----------------+
|     1, cat, dog|
|    2, bird, bee|
+----------------+
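For context, here is a minimal sketch of the direction I'm considering: split the header into field names, build a StructType programmatically, and split each remaining line into a Row. This assumes all columns can be treated as StringType and that a SparkSession named spark is available; it is not a working solution, just my rough idea:

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

val spark = SparkSession.builder().appName("rdd-to-df").getOrCreate()
val sc = spark.sparkContext

val csv = sc.parallelize(Array(
  "col1, col2, col3",
  "1, cat, dog",
  "2, bird, bee"))

// Build the schema from the first row: one StringType field per header token.
val header = csv.first()
val schema = StructType(
  header.split(",").map(name => StructField(name.trim, StringType, nullable = true)))

// Drop the header line, then split each remaining line into a Row.
val rows = csv
  .mapPartitionsWithIndex { (idx, iter) => if (idx == 0) iter.drop(1) else iter }
  .map(line => Row.fromSeq(line.split(",").map(_.trim)))

// Combine the rows with the programmatically built schema.
val df = spark.createDataFrame(rows, schema)
df.show()
```

Is this roughly the right approach, or is there a cleaner way to do the column breakout?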