I have a schema of this form, read from a JSON file:
root
 |-- fruit_id: string (nullable = true)
 |-- fruit_type: array (nullable = true)
 |    |-- name: string (nullable = true)
 |    |-- info: struct (nullable = true)
 |    |    |-- fruit_quality: array (nullable = true)
 |    |    |    |-- quality: string (nullable = true)
 |    |    |-- likes: string (containsNull = true)
 |    |-- finance: struct (nullable = true)
 |    |    |-- last_year_price: string (nullable = true)
 |    |    |-- current_price: string (nullable = true)
 |    |-- shops: struct (nullable = true)
 |    |    |-- shop1: string (nullable = true)
 |    |    |-- shop2: string (nullable = true)
 |-- season: string (nullable = true)
How can I get it into this form?
root
 |-- fruit_id: string (nullable = true)
 |-- fruit_type_name: string (nullable = true)
 |-- fruit_type_info_fruit_quality_quality: string (nullable = true)
 |-- fruit_type_info_likes: string (nullable = true)
 |-- fruit_type_finance_last_year_price: string (nullable = true)
 |-- fruit_type_finance_current_price: string (nullable = true)
 |-- fruit_type_shops_shop1: string (nullable = true)
 |-- fruit_type_shops_shop2: string (nullable = true)
 |-- season: string (nullable = true)
This is for the case of fruits. How would I flatten it the same way if I receive a file with info on vegetables?
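To make the naming rule I am after concrete, here is a minimal pure-Python sketch. It only derives the flattened column names from a schema dictionary; it does not touch a DataFrame or explode any arrays. The helpers `field`, `struct`, `array` and `flat_names` are names I made up for illustration, and I am assuming the dictionary layout mirrors what `df.schema.jsonValue()` returns.

```python
def field(name, dtype="string"):
    # One schema field; dtype is a type name or a nested struct/array dict.
    return {"name": name, "type": dtype, "nullable": True, "metadata": {}}

def struct(*fields):
    return {"type": "struct", "fields": list(fields)}

def array(element):
    return {"type": "array", "elementType": element, "containsNull": True}

def flat_names(schema, prefix=""):
    """Return the flattened leaf names, joining nested names with '_'."""
    names = []
    for f in schema["fields"]:
        name = prefix + f["name"]
        dtype = f["type"]
        # An array adds no name of its own, so look through it to its element type.
        while isinstance(dtype, dict) and dtype.get("type") == "array":
            dtype = dtype["elementType"]
        if isinstance(dtype, dict) and dtype.get("type") == "struct":
            names.extend(flat_names(dtype, name + "_"))
        else:
            names.append(name)
    return names

# The fruit schema from above, written with the hypothetical helpers.
fruit_schema = struct(
    field("fruit_id"),
    field("fruit_type", array(struct(
        field("name"),
        field("info", struct(
            field("fruit_quality", array(struct(field("quality")))),
            field("likes"),
        )),
        field("finance", struct(field("last_year_price"), field("current_price"))),
        field("shops", struct(field("shop1"), field("shop2"))),
    ))),
    field("season"),
)

for column in flat_names(fruit_schema):
    print(column)
```

Nothing in `flat_names` mentions fruits, so the same function should produce the names for a vegetables file with a similar shape. What I am missing is the equivalent logic on the actual DataFrame.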
I am having trouble flattening the array part. I am able to flatten structs inside structs; for that I followed this: link
I also added this piece of code to the code from the link above, to see if this approach would work:
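import pyspark.sql.functions as F
# Names of the top-level columns whose type is an array
array_cols = [c[0] for c in df.dtypes if c[1][:6] == 'array']
# Try to select every child of each array column as <array>_<child>
df = df.select(
    [F.col(nc + '.' + c).alias(nc + '_' + c)
     for nc in array_cols
     for c in df.select(nc + '.*').columns])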
But it's not working.
I then checked this link as well: link
But the issue here is that, while I can flatten the JSON file of fruits this way, if I then send a JSON file of vegetables with a similar schema, I'll have to redefine the code.
Another approach I tried was converting the array to a struct, so that I could then flatten the nested structs, but that wasn't helpful.
Lastly, I checked this link as well: link
But this approach threw an error saying that flattening is not possible, since I have an array of structs and not an array of arrays.
So how can I solve this?