Questions tagged [apache-spark-encoders]
54 questions
                    
165 votes · 9 answers
How to store custom objects in Dataset?
According to Introducing Spark Datasets:
As we look forward to Spark 2.0, we plan some exciting improvements to Datasets, specifically:
  ...
  Custom encoders – while we currently autogenerate encoders for a wide variety of types, we’d like to…
        
zero323 (322,348 rep)
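The most commonly cited workaround for this question is to fall back to a binary Kryo encoder when no Catalyst encoder exists for a type. A minimal sketch, assuming Spark 2.x and a hypothetical class `MyObj` that the built-in encoders cannot handle:

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical custom class not covered by the built-in encoders.
class MyObj(val i: Int) extends Serializable

object KryoEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("kryo-encoder").getOrCreate()
    // Serialize MyObj as an opaque binary blob; you lose column pruning
    // and predicate pushdown, but the Dataset compiles and runs.
    implicit val myObjEncoder: Encoder[MyObj] = Encoders.kryo[MyObj]
    val ds = spark.createDataset(Seq(new MyObj(1), new MyObj(2)))
    println(ds.count())
    spark.stop()
  }
}
```

The trade-off is that the Kryo-encoded column is a single opaque binary field, so Catalyst cannot optimize queries over the object's members.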
66 votes · 3 answers
Why is "Unable to find encoder for type stored in a Dataset" when creating a dataset of custom case class?
Spark 2.0 (final) with Scala 2.11.8. The following super simple code yields the compilation error Error:(17, 45) Unable to find encoder for type stored in a Dataset.  Primitive types (Int, String, etc) and Product types (case classes) are supported…
        
clay (18,138 rep)
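For case classes specifically, this error usually means the implicit Product encoders are not in scope, or the case class is defined inside the method that uses it. A sketch of the usual fix, assuming Spark 2.x and a hypothetical `Person` case class:

```scala
import org.apache.spark.sql.SparkSession

// Define the case class at top level, NOT inside the method that uses it;
// otherwise the compiler cannot derive a Product encoder for it.
case class Person(name: String, age: Int)

object ImplicitsSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("implicits").getOrCreate()
    import spark.implicits._  // brings the case-class (Product) encoders into scope
    val ds = Seq(Person("a", 1), Person("b", 2)).toDS()
    ds.show()
    spark.stop()
  }
}
```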
43 votes · 4 answers
Encoder error while trying to map dataframe row to updated row
When I try to do the same thing in my code, as shown below:
dataframe.map(row => {
  val row1 = row.getAs[String](1)
  val make = if (row1.toLowerCase == "tesla") "S" else row1
  Row(row(0),make,row(2))
})
I have taken the above reference…
        
Advika (585 rep)
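The error arises because `.map` returning a generic `Row` has no implicit encoder. A sketch, assuming Spark 2.x, that reuses the input schema via `RowEncoder`:

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.catalyst.encoders.RowEncoder

// When .map returns a generic Row, no implicit encoder exists,
// so supply a RowEncoder built from the output schema explicitly.
def fixMake(df: DataFrame): DataFrame = {
  implicit val rowEncoder = RowEncoder(df.schema)
  df.map { row =>
    val raw = row.getAs[String](1)
    val make = if (raw.toLowerCase == "tesla") "S" else raw
    Row(row(0), make, row(2))
  }
}
```

Here the output schema happens to match the input schema; if the mapped rows have a different shape, build a matching `StructType` and pass that to `RowEncoder` instead.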
35 votes · 2 answers
Encoder for Row Type Spark Datasets
I would like to write an encoder for a Row type in DataSet, for a map operation that I am doing. Essentially, I do not understand how to write encoders.
Below is an example of a map operation:
In the example below, instead of returning…
        
tsar2512 (2,826 rep)
28 votes · 2 answers
How to convert a dataframe to dataset in Apache Spark in Scala?
I need to convert my dataframe to a dataset and I used the following code:
    val final_df = Dataframe.withColumn(
      "features",
      toVec4(
        // casting into Timestamp to parse the string, and then into Int
       …
        user8131063
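The idiomatic conversion is `.as[T]` with a case class whose fields match the DataFrame's columns. A sketch, assuming Spark 2.x and a hypothetical `Record` schema:

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

// Hypothetical target schema; field names and types must
// match the DataFrame's columns for .as[T] to succeed.
case class Record(id: Long, features: Double)

def toDataset(df: DataFrame, spark: SparkSession): Dataset[Record] = {
  import spark.implicits._  // supplies the Encoder[Record]
  df.as[Record]
}
```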
23 votes · 3 answers
How to create a custom Encoder in Spark 2.X Datasets?
Spark Datasets move away from Rows to Encoders for POJOs/primitives. The Catalyst engine uses an ExpressionEncoder to convert columns in a SQL expression. However, there do not appear to be other subclasses of Encoder available to use as a…
        
WestCoastProjects (58,982 rep)
18 votes · 3 answers
Why is the error "Unable to find encoder for type stored in a Dataset" when encoding JSON using case classes?
I've written a Spark job:
object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("local")
    val sc = new SparkContext(conf)
    val ctx = new…
        
Milad Khajavi (2,769 rep)
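The snippet above uses the Spark 1.x `SparkContext` entry point; in Spark 2.x the usual pattern is to read the JSON through a `SparkSession` and attach a case-class encoder with `.as[T]`. A sketch, with a hypothetical `Person` schema and file path:

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical schema matching the JSON records.
case class Person(name: String, age: Long)

object JsonToDataset {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("json-ds").getOrCreate()
    import spark.implicits._  // case-class encoders
    // spark.read.json infers the schema; .as[Person] attaches the encoder.
    val people = spark.read.json("people.json").as[Person]
    people.show()
    spark.stop()
  }
}
```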
14 votes · 2 answers
Encode an ADT / sealed trait hierarchy into Spark DataSet column
If I want to store an Algebraic Data Type (ADT) (ie a Scala sealed trait hierarchy) within a Spark DataSet column, what is the best encoding strategy?
For example, if I have an ADT where the leaf types store different kinds of data:
sealed trait…
        
Ben Hutchison (2,433 rep)
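Spark cannot derive a Catalyst encoder for a sealed trait out of the box. The simplest (if opaque) strategy is a binary Kryo encoder for the trait; a sketch, assuming Spark 2.x and a hypothetical `Shape` ADT:

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

// Hypothetical ADT; the leaf types carry different payloads.
sealed trait Shape extends Serializable
case class Circle(radius: Double) extends Shape
case class Square(side: Double) extends Shape

object AdtEncoderSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("adt").getOrCreate()
    // Kryo stores each Shape as one opaque binary column,
    // trading Catalyst optimizations for simplicity.
    implicit val shapeEncoder: Encoder[Shape] = Encoders.kryo[Shape]
    val ds = spark.createDataset(Seq[Shape](Circle(1.0), Square(2.0)))
    println(ds.count())
    spark.stop()
  }
}
```

Richer strategies (e.g. flattening the hierarchy into a tag column plus nullable payload columns) keep the data queryable but require hand-written conversion logic.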
12 votes · 1 answer
Apache Spark 2.0: java.lang.UnsupportedOperationException: No Encoder found for java.time.LocalDate
I am using Apache Spark 2.0 and creating a case class to describe the schema of a Dataset. When I try to define a custom encoder for java.time.LocalDate, following How to store custom objects in Dataset?, I get the following exception:…
        
Harmeet Singh Taara (6,483 rep)
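Spark 2.x has no built-in encoder for `java.time.LocalDate` (native support arrived in Spark 3.0). One workaround is to store the supported `java.sql.Date` in the case class and convert at the boundary; a sketch with a hypothetical `EventRow`:

```scala
import java.time.LocalDate
import java.sql.Date
import org.apache.spark.sql.SparkSession

// java.sql.Date has a built-in encoder in Spark 2.x; java.time.LocalDate does not.
case class EventRow(id: Int, date: Date)

object LocalDateSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("dates").getOrCreate()
    import spark.implicits._
    val local: LocalDate = LocalDate.of(2016, 8, 1)
    // Date.valueOf(LocalDate) converts losslessly (available since Java 8).
    val ds = Seq(EventRow(1, Date.valueOf(local))).toDS()
    ds.show()
    spark.stop()
  }
}
```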
10 votes · 3 answers
Convert scala list to DataFrame or DataSet
I am new to Scala. I am trying to convert a Scala list (holding the results of some calculations on a source DataFrame) to a DataFrame or Dataset, but I cannot find any direct method to do that. However, I have tried the following process…
        
Leo (315 rep)
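There is in fact a direct method once `spark.implicits._` is in scope: `toDF` and `toDS` on local collections. A sketch, assuming Spark 2.x and hypothetical calculated values:

```scala
import org.apache.spark.sql.SparkSession

object ListToDatasetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.master("local[*]").appName("list-ds").getOrCreate()
    import spark.implicits._           // enables .toDF / .toDS on local Seqs
    val results = List(1.5, 2.5, 3.5)  // hypothetical calculated values
    val df = results.toDF("value")     // DataFrame with one named column
    val ds = results.toDS()            // Dataset[Double]
    df.show()
    println(ds.count())
    spark.stop()
  }
}
```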
8 votes · 1 answer
Spark Dataset : Example : Unable to generate an encoder issue
New to the Spark world, I am trying a Dataset example written in Scala that I found online.
On running it through SBT, I keep getting the following error:
org.apache.spark.sql.AnalysisException: Unable to generate an encoder for inner class
Any idea…
        user5131511
8 votes · 1 answer
Spark Dataset and java.sql.Date
Let's say I have a Spark Dataset like this:
scala> import java.sql.Date
scala> case class Event(id: Int, date: Date, name: String)
scala> val ds = Seq(Event(1, Date.valueOf("2016-08-01"), "ev1"), Event(2, Date.valueOf("2018-08-02"), "ev2")).toDS
I…
        
Lukáš Lalinský (40,587 rep)
6 votes · 2 answers
Rename columns in spark using @JsonProperty while creating Datasets
Is there a way to rename the column names of a Dataset using Jackson annotations while creating the Dataset?
My encoder class is as follows:
import com.fasterxml.jackson.annotation.JsonProperty;
import lombok.*;
import scala.Serializable;
import…
        
Arjav96 (79 rep)
6 votes · 5 answers
How to map rows to protobuf-generated class?
I need to write a job that reads a DataSet[Row] and converts it to a DataSet[CustomClass]
where CustomClass is a protobuf class.
val protoEncoder = Encoders.bean(classOf[CustomClass])
val transformedRows = rows.map {
  case Row(f1: String, f2: Long…
        
Apurva (153 rep)
6 votes · 1 answer
scala generic encoder for spark case class
How can I get this method to compile? Strangely, Spark's implicits are already imported.
def loadDsFromHive[T <: Product](tableName: String, spark: SparkSession): Dataset[T] = {
    import spark.implicits._
    spark.sql(s"SELECT * FROM…
        
Georg Heiler (16,916 rep)
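The compile error comes from the missing `Encoder[T]` evidence: inside a generic method, `spark.implicits._` cannot materialize an encoder for an abstract `T`. A sketch of the usual fix, requiring the encoder at the call site via a context bound (table name and schema here are hypothetical):

```scala
import org.apache.spark.sql.{Dataset, Encoder, SparkSession}

// Requiring an implicit Encoder[T] pushes encoder resolution to the
// call site, where the concrete case class is known.
def loadDsFromHive[T <: Product : Encoder](tableName: String, spark: SparkSession): Dataset[T] = {
  spark.sql(s"SELECT * FROM $tableName").as[T]
}

// Call site (hypothetical table and schema):
// case class Employee(id: Long, name: String)
// import spark.implicits._   // supplies Encoder[Employee]
// val ds = loadDsFromHive[Employee]("employees", spark)
```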