How to read input JSON using a schema file and populate a default value when a column is not found, in Scala?

Input DataFrame:

    val input_json = """[{"orderid":"111","customers":{"customerId":"123"},"Offers":[{"Offerid":"1"},{"Offerid":"2"}]}]"""
    val inputdataRdd = spark.sparkContext.parallelize(input_json :: Nil)
    val inputdataRdddf = spark.read.json(inputdataRdd)
    inputdataRdddf.show()

Schema DataFrame:

    val schema_json = """[{"orders":{"order_id":{"path":"orderid","type":"string","nullable":false},"customer_id":{"path":"customers.customerId","type":"int","nullable":false,"default_value":"null"},"offer_id":{"path":"Offers.Offerid","type":"string","nullable":false},"eligible":{"path":"eligible.eligiblestatus","type":"string","nullable":true,"default_value":"not eligible"}},"products":{"product_id":{"path":"product_id","type":"string","nullable":false},"product_name":{"path":"products.productname","type":"string","nullable":false}}}]"""
    val schemaRdd = spark.sparkContext.parallelize(schema_json :: Nil)
    val schemaRdddf = spark.read.json(schemaRdd)
    schemaRdddf.show()

Using the schema DataFrame, I want to read all the columns from the input DataFrame.

  1. If the nullable key is true, I want to populate the column with the default value when the column is not present or has no data. In the example above, eligible.eligiblestatus is not present, so I want to populate it with the default value.
  2. I also want to change the data type of each column based on the type key defined in the schema JSON. For example, customer_id is of type int in the schema JSON but arrives as a string in the input DataFrame, so I want to cast it to an integer.
  3. The final column name should be taken from the key in the schema JSON. For example, order_id is the key for the orderid attribute.

The final DataFrame should have columns like:

order_id: string, customer_id: int, offer_id: string (array type cast to string), eligiblestatus: string
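One possible way to approach the three requirements is sketched below. It is an illustration under stated assumptions, not a definitive solution: the names FieldSpec, ApplySchemaSketch and toColumn are hypothetical, the "orders" entries are hard-coded rather than parsed from schema_json, a default_value of "null" is treated as a real null, the nullable flag is not modelled (a default is applied whenever one is given), and array values are joined with a comma. A path is considered missing when Spark cannot resolve it against the input DataFrame.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, concat_ws, lit, when}
import org.apache.spark.sql.types.ArrayType
import scala.util.Try

// Hypothetical holder for one entry of the schema JSON (not part of the original post).
case class FieldSpec(name: String, path: String, dataType: String, default: Option[String])

object ApplySchemaSketch {

  // Build one output column: cast to the schema type, filled with the default when
  // the path is missing or the value is null, and renamed to the schema key.
  def toColumn(df: DataFrame, f: FieldSpec): Column = {
    val fallback = f.default.map(d => lit(d)).getOrElse(lit(null)).cast(f.dataType)

    // select() analyses the plan eagerly, so an unresolvable path fails here.
    val resolvedType = Try(df.select(col(f.path)).schema.head.dataType).toOption

    val value = resolvedType match {
      case Some(_: ArrayType) =>
        // Assumption: flatten an array of strings into one comma-separated string.
        when(col(f.path).isNull, fallback)
          .otherwise(concat_ws(",", col(f.path)).cast(f.dataType))
      case Some(_) => coalesce(col(f.path).cast(f.dataType), fallback)
      case None    => fallback // path not present in the input at all
    }
    value.as(f.name)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, only for trying the sketch out.
    val spark = SparkSession.builder().master("local[*]").appName("apply-schema-sketch").getOrCreate()
    import spark.implicits._

    val inputJson =
      """[{"orderid":"111","customers":{"customerId":"123"},"Offers":[{"Offerid":"1"},{"Offerid":"2"}]}]"""
    val input = spark.read.json(Seq(inputJson).toDS())

    // Hand-written copy of the "orders" block of schema_json.
    val orders = Seq(
      FieldSpec("order_id", "orderid", "string", None),
      FieldSpec("customer_id", "customers.customerId", "int", None),
      FieldSpec("offer_id", "Offers.Offerid", "string", None),
      FieldSpec("eligible", "eligible.eligiblestatus", "string", Some("not eligible"))
    )

    val result = input.select(orders.map(toColumn(input, _)): _*)
    result.printSchema()
    result.show(false)

    spark.stop()
  }
}
```

Resolving each path with Try keeps the check generic for nested fields such as customers.customerId. The same FieldSpec list could be built by reading schemaRdddf instead of typing it by hand.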
