I'd like to know the PySpark equivalent of the reset_index() command used in pandas. When I call the default command:
data.reset_index()
I get an error:
AttributeError: 'DataFrame' object has no attribute 'reset_index'
As the other comments mentioned, a Spark DataFrame is distributed and has no row index, so there is nothing to reset. If you do need to add an index column to your DataFrame, you can use:
from pyspark.sql.functions import monotonically_increasing_id
df = df.withColumn("index_column", monotonically_increasing_id())