I've been searching for a while for a way to use a Scala class from PySpark, and I haven't found any documentation or guide on the subject.
Let's say I create a simple class in Scala that uses some Apache Spark libraries, something like this:
import org.apache.spark.sql.{DataFrame, SQLContext}
import org.apache.spark.sql.functions.col

class SimpleClass(sqlContext: SQLContext, df: DataFrame, column: String) {
  // Selects the given column from the DataFrame
  def exe(): DataFrame = {
    import sqlContext.implicits._
    df.select(col(column))
  }
}
- Is there any way to use this class from PySpark?
- Is it very difficult to do?
- Do I have to create a .py file?
- Is there any guide that shows how to do that?
By the way, I also looked at the Spark source code and felt a bit lost; I was unable to replicate its functionality for my own purposes.
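
For what it's worth, below is a rough sketch of what I was hoping might work on the PySpark side. It assumes SimpleClass is compiled into a jar and passed to spark-submit with --jars, and it relies on PySpark internals I stumbled on while reading the source (sqlContext._ssql_ctx, df._jdf, sc._jvm), so I have no idea whether this is a valid approach:

from pyspark.sql import DataFrame

def simple(sqlContext, df, column):
    # Reach into the JVM through the py4j gateway PySpark keeps in sc._jvm,
    # hand the Java-side SQLContext and DataFrame to the Scala class,
    # then wrap the returned Java DataFrame back into a Python DataFrame.
    sc = sqlContext._sc
    jsql_ctx = sqlContext._ssql_ctx   # Java-side SQLContext
    jdf = df._jdf                     # Java-side DataFrame
    simple_obj = sc._jvm.SimpleClass(jsql_ctx, jdf, column)
    return DataFrame(simple_obj.exe(), sqlContext)

# Intended usage (hypothetical):
# result = simple(sqlContext, df, "some_column")

Again, I'm not sure whether relying on these private attributes is the right way to do this, or whether there is a cleaner supported route.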