Channel: Recent Questions - Stack Overflow

Dataset encoding problems with maps and sequences of LocalDate

I'm writing a solution in Scala on Spark 3.5 that uses complex map/seq data structures of dates, and I need to represent missing dates with either null or None. As this is Scala, my preference is None. However, I get an exception when trying to encode Option[LocalDate] inside a sequence or map. As a workaround I switched to null, but I can't do this in a map because null can't be used as a key.
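The null workaround described above might look something like the following sketch. It is plain Scala with no Spark dependency, and it assumes the Dataset field is declared as Seq[LocalDate] rather than Seq[Option[LocalDate]]. The object and method names are hypothetical and do not come from the original code.

```scala
import java.time.LocalDate

// Hypothetical helpers (not from the original code): convert between the
// preferred Option-based representation and a null-based one.
object NullableDates {
  // None becomes null, Some(d) becomes d.
  def toNullable(dates: Seq[Option[LocalDate]]): Seq[LocalDate] =
    dates.map(_.orNull)

  // Option(null) is None, so missing dates come back as None.
  def fromNullable(dates: Seq[LocalDate]): Seq[Option[LocalDate]] =
    dates.map(Option(_))
}
```

This only covers sequence elements and map values. It does not help for map keys, where null is not allowed.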

This is code that reproduces the problem:

import java.time.LocalDate

case class SubClass[T](i: Option[T])
case class TestClass[T](i: Option[T] = None, st: Option[SubClass[T]], sq: Option[Seq[Option[T]]] = None)

  // Fails: Option[LocalDate] inside a Seq
  it should "handle sequence of date" in {
    val sparkSession = spark
    import sparkSession.implicits._
    val aDate = LocalDate.parse("2024-04-29")
    var ds = Seq(TestClass[LocalDate](Some(aDate), Some(SubClass[LocalDate](Some(aDate))), Some(Seq(Some(aDate))))).toDS
    ds.show
  }

  // Passes: Option[LocalDate] as a field and in a nested case class, no Seq
  it should "handle date structure" in {
    val sparkSession = spark
    import sparkSession.implicits._
    val aDate = LocalDate.parse("2024-04-29")
    var ds = Seq(TestClass[LocalDate](Some(aDate), Some(SubClass[LocalDate](Some(aDate))))).toDS
    ds.show
  }

  // Passes: Option[Double] inside a Seq
  it should "handle sequence of double" in {
    val sparkSession = spark
    import sparkSession.implicits._
    val aNum = 2.42
    var ds = Seq(TestClass[Double](Some(aNum), Some(SubClass[Double](Some(aNum))), Some(Seq(Some(aNum))))).toDS
    ds.show
  }

  // Passes: Option[String] inside a Seq
  it should "handle sequence of string" in {
    val sparkSession = spark
    import sparkSession.implicits._
    val aString = "Foo"
    var ds = Seq(TestClass[String](Some(aString), Some(SubClass[String](Some(aString))), Some(Seq(Some(aString))))).toDS
    ds.show
  }

All of the above tests pass except "handle sequence of date", which fails with the exception org.apache.spark.SparkRuntimeException: Error while encoding: java.lang.RuntimeException: scala.Some is not a valid external type for schema of date.

This seems to be an issue only with LocalDate, as the Double and String tests work just fine.

Can anyone suggest a workaround or fix for this?

Thanks,

David