I have a small df that consists of two columns with a description and a value:
+--------------+--------------------+
|   description|               value|
+--------------+--------------------+
|   PED_tobacco|                 0.4|
|PED_nontobacco|                1.49|
|           GMI|    17590.8855333196|
|       CMO_NGP|             53389.0|
|             A|                80.3|
|         SC_TT|              -0.146|
|        SC_THP|              -0.056|
|       SC_ENDS|              -0.007|
|      SC_CF_PD|              -0.002|
|      SC_CF_FF|              -0.031|
|      CO2_comb|             1.23E-6|
|   CO2_lighter|2.083000000000000...|
|   Carbon_Cost|               114.0|
|     PR_SDG12A|               -0.05|
|     PR_SDG12B|               -0.01|
|       PR_SDG3|                 0.0|
|      PR_SDG14|               -0.27|
|EDEVICE_SDG12A|               -0.01|
|EDEVICE_SDG12B|               -0.05|
|  EDEVICE_SDG3|               -0.01|
+--------------+--------------------+
I have been trying to find a way to convert each row into an independently defined variable, so that I can reference it directly. For example, I want to be able to write PED_tobacco * 10 and get back 4.
I tried converting it into a list of dictionaries (at least that's how I would describe it, coming from a Python background), using:
ass_dict = df_assumptions \
    .rdd \
    .map(lambda row: {row[0]: row[1]}) \
    .collect()
# Which prints:
[{'PED_tobacco': 0.4}, {'PED_nontobacco': 1.49}, {'GMI': 17590.8855333196}, {'CMO_NGP': 53389.0}, {'A': 80.3}, {'SC_TT': -0.146}, {'SC_THP': -0.056}, {'SC_ENDS': -0.007}, {'SC_CF_PD': -0.002}, {'SC_CF_FF': -0.031}, {'CO2_comb': 1.23e-06}, {'CO2_lighter': 2.0830000000000002e-08}, {'Carbon_Cost': 114.0}, {'PR_SDG12A': -0.05}, {'PR_SDG12B': -0.01}, {'PR_SDG3': 0.0}, {'PR_SDG14': -0.27}, {'EDEVICE_SDG12A': -0.01}, {'EDEVICE_SDG12B': -0.05}, {'EDEVICE_SDG3': -0.01}, {'EDEVICE_SDG14': 0.0}, {'TL_GL': 1.0}, {'TL_GR': 0.0}, {'EW_GL': 0.83}]
But I still can't access each variable independently. In Python with pandas, I do this using:
def convert_to_var(df):
    # Collect each two-column row as a (description, value) pair
    val = []

    for _, row in df.iterrows():
        val.append(row)

    # dict() turns the list of pairs into {description: value}
    return dict(val)
val_dict = convert_to_var(IA)
globals().update(val_dict)
Is there a way to do the same in Spark? How can I get each description, with its value, as a separate variable that I can call directly? Thanks in advance.
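To make the goal concrete, here is a minimal sketch of the behaviour I am after. It assumes the collected result is an ordinary Python list of one-key dicts, as printed above; only three of the rows are copied in, and the names `merged` and `result` are hypothetical. It runs in plain Python without a Spark session, so it only illustrates the step after `.collect()`.

```python
# A few rows copied from the collected output above
# (a plain Python list of one-key dicts, no Spark needed here).
ass_dict = [{'PED_tobacco': 0.4}, {'PED_nontobacco': 1.49}, {'Carbon_Cost': 114.0}]

# Merge the one-key dicts into a single {description: value} mapping.
merged = {}
for d in ass_dict:
    merged.update(d)

# Values can now be looked up by name without creating variables.
result = merged['PED_tobacco'] * 10

# Optional: mirror the pandas approach and expose every key as a global name.
globals().update(merged)
```

If the lambda in the Spark snippet returned a `(row[0], row[1])` tuple instead of a dict, the RDD method `collectAsMap()` might return this single dictionary directly; I have not verified that on this data.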
 
    