I have collected the values of a DataFrame column in Spark:
temp = df.select('item_code').collect()
Result: 
[Row(item_code=u'I0938'),
 Row(item_code=u'I0009'),
 Row(item_code=u'I0010'),
 Row(item_code=u'I0010'),
 Row(item_code=u'C0723'),
 Row(item_code=u'I1097'),
 Row(item_code=u'C0117'),
 Row(item_code=u'I0009'),
 Row(item_code=u'I0009'),
 Row(item_code=u'I0009'),
 Row(item_code=u'I0010'),
 Row(item_code=u'I0009'),
 Row(item_code=u'C0117'),
 Row(item_code=u'I0009'),
 Row(item_code=u'I0596')]
Now I would like to assign a number to each word (item code). If a word is duplicated, every occurrence should get the same number. I am using Spark with RDDs, not Pandas.
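To make the requirement concrete, here is a rough plain-Python sketch of the mapping I have in mind. It is only an illustration, not Spark code. The function name `assign_ids`, starting the numbering at 0, and numbering in order of first appearance are my own assumptions; any consistent numbering would be fine.

```python
# Plain-Python illustration only; the real data is in a Spark DataFrame.
# The list below repeats the item codes from the collected result above.
codes = ['I0938', 'I0009', 'I0010', 'I0010', 'C0723', 'I1097', 'C0117',
         'I0009', 'I0009', 'I0009', 'I0010', 'I0009', 'C0117', 'I0009',
         'I0596']


def assign_ids(values):
    """Pair each value with a number; duplicates reuse the same number.

    Assumption: numbers start at 0 and follow order of first appearance.
    """
    ids = {}
    for value in values:
        if value not in ids:
            # A new distinct value gets the next unused number.
            ids[value] = len(ids)
    return [(value, ids[value]) for value in values]


numbered = assign_ids(codes)
```

With this sketch every 'I0009' gets one number and every 'I0010' gets another. What I need is the equivalent done with Spark / RDD operations, without collecting everything to the driver.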
Please help me solve this problem!
 
    