I have a Spark DataFrame in Python with nested columns. Given the path a.b.c, I want to check whether there is a nested column after c called d, i.e. whether a.b.c.d exists.
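To make this concrete, here is a minimal example of the kind of DataFrame I mean (the data and schema here are made up):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # One row whose single column a nests structs b -> c -> d
    df = spark.createDataFrame(
        [({"b": {"c": {"d": "some value"}}},)],
        "a struct<b: struct<c: struct<d: string>>>",
    )
    df.printSchema()
    # root
    #  |-- a: struct (nullable = true)
    #  |    |-- b: struct (nullable = true)
    #  |    |    |-- c: struct (nullable = true)
    #  |    |    |    |-- d: string (nullable = true)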
Simply checking df.columns['a']['b']['c']['d'] or df.columns['a.b.c.d'] doesn't work, since df.columns is just a flat list of top-level column names, so I found that the df.schema property can be used instead.
So I drill down through the schema, e.g.:
    y = df.schema['a'].dataType['b'].dataType['c'].dataType
and then I should be able to check whether d is in y.
The way I did it is to simply try y['d'] and treat a failure as meaning the field doesn't exist.
But I don't think wrapping this in try/except is the best way.
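For reference, the try-based version looks roughly like this (has_field is just a helper name I made up):

    # Probe for the field; StructType lookup by name raises KeyError if it is absent
    def has_field(struct, name):
        try:
            struct[name]
            return True
        except KeyError:
            return False

    has_field(y, 'd')  # True if a.b.c.d exists, False otherwise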
So I tried checking 'd' in y instead, but apparently this doesn't work, even though retrieving the element with y['d'] succeeds when it exists.
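To show concretely what I mean (using the y from above):

    y['d']    # succeeds and returns the StructField for d when it exists
    'd' in y  # evaluates to False even when y['d'] succeeds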
y is a StructType, which prints as StructType(List(StructField(d,StringType,true), ...other columns)).
So I don't really know how to properly check whether d is in y. Why can't I directly check 'd' in y when I can retrieve y['d']? Can anyone help? I'm also new to Python, and I can't find or think of another solution.