I'm a beginner with Spark. I have Avro records, and I'm creating a Dataset from those records:
Dataset<Row> ds = spark.read().format("com.databricks.spark.avro")
    .option("avroSchema", schema.toString()).load("./*.avro");
The values in one of my columns look like this:
+--------------------------+
|           col1           |
| VCE_B_WSI_20180914_573   |
| WCE_C_RTI_20181223_324   |
+--------------------------+
I would like to split this column into multiple columns and then group by the new columns, like below:
+------------------+
|col1  |col2|col3  |
|   VCE|   B|   WSI|
|   WCE|   C|   RTI|
+------------------+
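To make the expected result concrete, here is a minimal plain-Java sketch of the split I mean for a single value. This is not Spark code: the class name SplitSketch and the method firstThreeParts are made up for illustration, and it assumes the separator is always an underscore and that every value has at least three parts.

```java
import java.util.Arrays;

public class SplitSketch {
    // Hypothetical helper: splits a value such as "VCE_B_WSI_20180914_573"
    // on underscores and keeps only the first three parts.
    // Assumption: the value has at least three parts; if it has fewer,
    // Arrays.copyOf pads the result with nulls.
    public static String[] firstThreeParts(String value) {
        String[] parts = value.trim().split("_");
        return Arrays.copyOf(parts, 3);
    }

    public static void main(String[] args) {
        // Sample values taken from the column shown above.
        System.out.println(Arrays.toString(firstThreeParts("VCE_B_WSI_20180914_573")));
        System.out.println(Arrays.toString(firstThreeParts("WCE_C_RTI_20181223_324")));
    }
}
```

What I'm looking for is the equivalent of this per-value logic applied to the whole column in the Dataset, with each part ending up in its own column.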
I would really appreciate any tips on how to go about this. Should I convert the Dataset to an RDD and apply these transformations there? I'm not sure whether I can add new columns to an RDD.