I have the following working code:
def replaceNone(row):
myList = []
row_len = len(row)
for i in range(0, row_len):
if row[i] is None:
myList.append("")
else:
myList.append(row[i])
return myList
rdd_out = rdd_in.map(lambda row : replaceNone(row))
Here row is from pyspark.sql import Row
However, it is kind of lengthy and ugly. Is it possible to avoid making the replaceNone function by writing everything in the lambda process directly? Or at least simplify replaceNone()? Thanks!