I have created a DataFrame, and the input looks like this:
   +-----------------------------------+
   |value                              |
   +-----------------------------------+
   |1   PRE123                    21   |
   |2   TEST                      32   |
   |7   XYZ                       .7   |
   +-----------------------------------+
Based on the metadata below, I need to split the `value` column of this DataFrame and create a new DataFrame with the columns id, name and class. The start and end positions of each column are given in this JSON metadata:
    [
      {
        "columnName": "id",
        "start": 1,
        "end": 2
      },
      {
        "columnName": "name",
        "start": 5,
        "end": 10
      },
      {
        "columnName": "class",
        "start": 20,
        "end": 22
      }
    ]
Expected output:
  +---+------+-----+
  | id|  name|class|
  +---+------+-----+
  |  1|PRE123|   21|
  |  2|  TEST|   32|
  |  7|   XYZ|   .7|
  +---+------+-----+
To build the new DataFrame, I add one column expression per metadata entry to a list:
   list.+=(loadedDF.col("value").substr(fixedLength.getStart, (fixedLength.getEnd - fixedLength.getStart)).alias(fixedLength.getColumnName))
and from this list I create the DataFrame:
   val df: DataFrame = loadedDF.select(list: _*)
Is there a better approach for creating the DataFrame from the metadata? My concern is that building the list will bring all the data to the driver node.
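For reference, here is the whole approach as a minimal, self-contained sketch. `FieldSpec` and `FixedWidthSplit` are hypothetical names standing in for my metadata class and job, and the local `SparkSession` is only there to make the example runnable. As far as I understand the API, a `Column` is only an expression and `select` is a lazy transformation, so the list itself holds no rows; `explain()` is included to inspect the resulting plan.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical stand-in for one parsed entry of the JSON metadata.
final case class FieldSpec(columnName: String, start: Int, end: Int)

object FixedWidthSplit {

  // Builds one Column expression per metadata entry. No rows are read here.
  def toColumns(specs: Seq[FieldSpec]): Seq[Column] =
    specs.map { s =>
      // Column.substr(startPos, len) is 1-based. The length (end - start)
      // mirrors my current code; if "end" is meant to be inclusive, the
      // length would have to be end - start + 1 instead.
      col("value").substr(s.start, s.end - s.start).alias(s.columnName)
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, purely for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("fixed-width-split")
      .getOrCreate()
    import spark.implicits._

    // Sample rows copied from the input shown above.
    val loadedDF: DataFrame = Seq(
      "1   PRE123                    21   ",
      "2   TEST                      32   ",
      "7   XYZ                       .7   "
    ).toDF("value")

    // Metadata values copied from the JSON above.
    val specs = Seq(
      FieldSpec("id", 1, 2),
      FieldSpec("name", 5, 10),
      FieldSpec("class", 20, 22)
    )

    val df: DataFrame = loadedDF.select(toColumns(specs): _*)

    df.explain() // prints the physical plan for the projection
    df.show()    // the action that actually triggers the computation

    spark.stop()
  }
}
```

Using `map` over the metadata instead of appending to a mutable list keeps `toColumns` a pure function of the metadata, which makes it easy to check separately from the data.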
 
    