Currently I write my data to an RDBMS using Sqoop, so the data is stored in HDFS first and then moved to the RDBMS. Is there any way to store an RDD directly to Hive?
1 Answer
Yes, you can write an RDD to Hive. One way is to convert the RDD to a DataFrame and then call saveAsTable(), as below:
import org.apache.spark.sql.hive.HiveContext
val hiveContext = new HiveContext(sc)
import hiveContext.implicits._
// read the data and perform some transformations to produce myRdd
val myDF = myRdd.toDF("column names") // pass one name per column
Then you can create the table and write the data into it:
myDF.write.saveAsTable("tableName")
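// with a save mode (write is a parameterless method in Scala, so no parentheses)
import org.apache.spark.sql.SaveMode
myDF.write.mode(SaveMode.Overwrite).saveAsTable("tableName")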
You can set a save mode as shown above. The available SaveMode values are Append, Ignore, Overwrite, and ErrorIfExists.
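For reference, here is a minimal end-to-end sketch of the same approach. It assumes Spark 2.x or later, where SparkSession with enableHiveSupport() takes the place of HiveContext, and that the spark-hive module and a reachable Hive metastore are available. The Person case class, the sample rows, and the table name people are hypothetical and only there for illustration; your own RDD and schema would replace them.

```scala
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical record type, used only for illustration.
// Defined at top level so Spark can derive an encoder for it.
case class Person(name: String, age: Int)

object RddToHive {
  def main(args: Array[String]): Unit = {
    // enableHiveSupport() connects the session to the Hive metastore
    // (assumes the spark-hive module is on the classpath).
    val spark = SparkSession.builder()
      .appName("rdd-to-hive")
      .enableHiveSupport()
      .getOrCreate()
    import spark.implicits._

    // Stand-in for an RDD produced by earlier transformations.
    val myRdd = spark.sparkContext.parallelize(
      Seq(Person("alice", 30), Person("bob", 25)))

    // Column names are taken from the case class fields (name, age).
    val myDF = myRdd.toDF()

    // Append rows to the table, creating it if it does not exist yet.
    // "people" is a hypothetical table name.
    myDF.write.mode(SaveMode.Append).saveAsTable("people")

    spark.stop()
  }
}
```

Using a case class instead of passing names to toDF() keeps column names and types in one place, which may be easier to maintain when the RDD has several columns.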
koiralo