Reading/Writing Different File Formats in HDFS Using PySpark

Issue: How to read/write different file formats in HDFS using PySpark. For each file format, the list below gives the read/write procedure and an example without compression.

Text file
    Read:  sc.textFile()
           e.g. orders = sc.textFile("/user/BDD/navnit/data-master/retail_db/orders")
    Write: rdd.saveAsTextFile()
           e.g. orders.saveAsTextFile("/user/BDD/navnit/saveTextFile/orders")

Sequence file
    Read:  sc.sequenceFile()
           e.g. ordersSF = sc.sequenceFile('/user/BDD/navnit/saveSequenceFile/orders')
    Write: PipelinedRDD.saveAsSequenceFile()
           e.g. ordersKV.saveAsSequenceFile('/user/BDD/navnit/saveSequenceFile/orders')

Avro file
    Read:  sqlContext.read.format("com.databricks.spark.avro").load()
           e.g. orders = sqlContext.read.format("com.databricks.spark.avro").load("/home/BDD/navnit/orders/")
    Write: dataFrame.write.format("com.databricks.spark.avro").save()
           e.g. orders.write.format("com.databricks.spark.avro").save("/user/BDD/navnit/saveAvroFile/orders")

Parquet file
    Read:  sqlContext.read.parquet()
           e.g. ordersParquet = …
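
For the text-file case, a minimal end-to-end sketch could look like the following. The paths are the ones from the list above; the application name is an illustrative assumption, and saveAsTextFile fails if the output directory already exists.

    from pyspark import SparkContext

    # application name is an illustrative assumption
    sc = SparkContext(appName="hdfsFileFormats")

    # Read: every line of the files under the input directory becomes one RDD element
    orders = sc.textFile("/user/BDD/navnit/data-master/retail_db/orders")

    # Write: Spark creates the output directory and writes one part file per partition
    orders.saveAsTextFile("/user/BDD/navnit/saveTextFile/orders")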
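
SequenceFiles store key/value pairs, so the RDD has to be a pair RDD before saveAsSequenceFile can be called. The sketch below continues from the text-file example (reusing sc and the orders RDD); taking the first comma-separated field as the key is only an illustrative assumption.

    # Build a (key, value) pair RDD; the key choice here is purely illustrative
    ordersKV = orders.map(lambda line: (line.split(",")[0], line))

    # Write: keys and values are serialized as Hadoop Writables
    ordersKV.saveAsSequenceFile('/user/BDD/navnit/saveSequenceFile/orders')

    # Read: returns an RDD of (key, value) tuples
    ordersSF = sc.sequenceFile('/user/BDD/navnit/saveSequenceFile/orders')
    print(ordersSF.first())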
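
The Avro examples rely on the external spark-avro package from Databricks, so it has to be on the classpath when the job is submitted, for example with --packages com.databricks:spark-avro_2.11:4.0.0 (the exact coordinates depend on your Spark and Scala versions and are an assumption here). A sketch reusing the SparkContext sc from above:

    from pyspark.sql import SQLContext

    sqlContext = SQLContext(sc)

    # Read: load Avro files into a DataFrame through the spark-avro data source
    orders = sqlContext.read.format("com.databricks.spark.avro").load("/home/BDD/navnit/orders/")

    # Write: save the DataFrame back out as Avro
    orders.write.format("com.databricks.spark.avro").save("/user/BDD/navnit/saveAvroFile/orders")

On Spark 2.4 and later, the Apache-maintained spark-avro module can be used instead, in which case the short format name "avro" replaces "com.databricks.spark.avro".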
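
Parquet support is built into Spark SQL, so no extra package is needed. The original Parquet example is truncated, so the path below is only an illustrative assumption; the sketch reuses sqlContext and the orders DataFrame from the Avro example.

    # Write: persist the DataFrame as Parquet (output path is an assumption)
    orders.write.parquet("/user/BDD/navnit/saveParquetFile/orders")

    # Read: load the Parquet files back into a DataFrame
    ordersParquet = sqlContext.read.parquet("/user/BDD/navnit/saveParquetFile/orders")
    ordersParquet.show(5)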