
I had a problem with registerTempTable after creating a DataFrame. What could be the possible reason? Thanks.

import org.apache.spark.sql.SQLContext
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._
trainingData.registerTempTable("trainingdata")
val countResult = sqlContext.sql("SELECT COUNT(*) FROM trainingdata").collect()

The error message is:

java.lang.RuntimeException: Table Not Found: trainingdata
    at scala.sys.package$.error(package.scala:27)
    at org.apache.spark.sql.catalyst.analysis.SimpleCatalog.lookupRelation(Catalog.scala:139)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.getTable(Analyzer.scala:257)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$7.applyOrElse(Analyzer.scala:268)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$7.applyOrElse(Analyzer.scala:264)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:56)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$1.apply(LogicalPlan.scala:54)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:249)

pheeleeppoo
Ruxi Zhang

2 Answers


Is it possible that you have not actually created the trainingData DataFrame?

You need to have a statement like the following:

  • If you are reading from a Hive table

    val trainingData = sqlContext.table("libname.tablename")
    
  • If you are converting a seq/array to a Dataframe

    val trainingData = Seq((1,2,3,4)).toDF("ce_sp", "ce_sp2", "ce_colour", "ce_sp3")
    

Here are a bunch of other ways to convert an RDD to a DataFrame: How to convert rdd object to dataframe in spark
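Putting the pieces together, here is a minimal sketch of the full flow the question was attempting, with the definition of trainingData included before the registration step (the column names and values are the hypothetical ones from this answer, and the local SparkContext is only for illustration; in spark-shell, sc already exists):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Local context for illustration only (Spark 1.x-era API, matching the question)
val sc = new SparkContext(new SparkConf().setMaster("local[*]").setAppName("tempTableDemo"))
val sqlContext = new SQLContext(sc)
import sqlContext.implicits._

// The key point: trainingData must be defined *before* it is registered
val trainingData = Seq((1, 2, 3, 4)).toDF("ce_sp", "ce_sp2", "ce_colour", "ce_sp3")
trainingData.registerTempTable("trainingdata")

// Now the temp table lookup succeeds
val countResult = sqlContext.sql("SELECT COUNT(*) FROM trainingdata").collect()
```

If trainingData was never assigned in the current session (for example, a failed earlier cell in a notebook), the registerTempTable line never runs and the subsequent SQL lookup fails exactly as in the question's stack trace.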

Sayon M

As of Spark version 2 and above, you don't need to create an SQLContext or import its implicits just to run SQL queries; you can run them directly through the spark session object, something like below:

val sqlSeason = spark.sql("""
    select distinct a.sku, a.season, a.counter from SEASON_UPDATE2 a
""")

sqlSeason.createOrReplaceTempView("SEASON_UPDATE1")

sqlSeason.show()
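A self-contained sketch of this Spark 2+ flow, assuming hypothetical rows standing in for the SEASON_UPDATE2 table (the sku/season/counter values are made up for illustration; in spark-shell the spark session already exists):

```scala
import org.apache.spark.sql.SparkSession

// Local session for illustration only
val spark = SparkSession.builder().master("local[*]").appName("tempViewDemo").getOrCreate()
import spark.implicits._ // still needed to call toDF on a local Seq

// Hypothetical data registered under the table name used in this answer
Seq(("sku1", "winter", 1), ("sku1", "winter", 1), ("sku2", "summer", 2))
  .toDF("sku", "season", "counter")
  .createOrReplaceTempView("SEASON_UPDATE2")

val sqlSeason = spark.sql(
  "select distinct a.sku, a.season, a.counter from SEASON_UPDATE2 a")
sqlSeason.createOrReplaceTempView("SEASON_UPDATE1")

val distinctCount = sqlSeason.count() // DISTINCT collapses the duplicated row
```

Note that createOrReplaceTempView is the Spark 2+ replacement for registerTempTable, which was deprecated in Spark 2.0; the view is only visible within the session that created it.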
Sailendra