Suppose I have a DataFrame x with this schema:
from pyspark.sql.types import StructType, StructField, DoubleType

xSchema = StructType([
    StructField("a", DoubleType(), True),
    StructField("b", DoubleType(), True),
    StructField("c", DoubleType(), True)])
I then have the DataFrame:
DataFrame[a: double, b: double, c: double]
I would like to add a derived integer column. So far I am able to create a boolean column:
x = x.withColumn('y', (x.a-x.b)/x.c > 1)
My new schema is:
DataFrame[a: double, b: double, c: double, y: boolean]
However, I would like column y to contain 0 for False and 1 for True.
The cast function can only operate on a column, not a DataFrame, and the withColumn function can only operate on a DataFrame. How do I add a new column and cast it to integer at the same time?