I am working with DataFrames which elements have got a schema similar to:
root
 |-- NPAData: struct (nullable = true)
 |    |-- NPADetails: struct (nullable = true)
 |    |    |-- location: string (nullable = true)
 |    |    |-- manager: string (nullable = true)
 |    |-- service: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- serviceName: string (nullable = true)
 |    |    |    |-- serviceCode: string (nullable = true) 
 |-- NPAHeader: struct (nullable = true)
 |    |    |-- npaNumber: string (nullable = true)
 |    |    |-- date: string (nullable = true)
In my DataFrame I want to group all elements which has the same NPAHeader.code, so to do that I am using the following line:
val groupedNpa = orderedNpa.groupBy($"NPAHeader.code" ).agg(collect_list(struct($"NPAData",$"NPAHeader")).as("npa"))
After this I have a dataframe with the following schema:
StructType(StructField(npaNumber,StringType,true), StructField(npa,ArrayType(StructType(StructField(NPAData...)))))
An example of each Row would be something similar to:
[1234,WrappedArray([npaNew,npaOlder,...npaOldest])]
Now what I want is to generate another DataFrame with picks up just one of the element in the WrappedArray, so I want an output similar to:
[1234,npaNew]
Note: The chosen element from the WrappedArray is the one that matches a complext logic after iterating over the whole WrappedArray. But to simplify the question, I will pick up always the last element of the WrappedArray (after iterating all over it).
To do so, I want to define a recurside udf
import org.apache.spark.sql.functions.udf
def returnRow(elementList : Row)(index:Int): Row = {
  val dif = elementList.size - index
  val row :Row = dif match{
    case 0 => elementList.getAs[Row](index)
    case _ => returnRow(elementList)(index + 1) 
  }
  row
} 
val returnRow_udf = udf(returnRow _)
groupedNpa.map{row => (row.getAs[String]("npaNumber"),returnRow_udf(groupedNpa("npa")(0)))}
But I am getting the following error in the map:
Exception in thread "main" java.lang.UnsupportedOperationException: Schema for type Int => Unit is not supported
What am I doing wrong?
As an aside, I am not sure if I am passing correctly the npa column with groupedNpa("npa"). I am accesing the WrappedArray as a Row, because I don't know how to iterate over Array[Row] (the get(index) method is not present in Array[Row])
 
    