Mastering PySpark — File formats

Amit Singh Rathore
Published in Dev Genius
2 min read · May 3, 2024


A cheatsheet for working with different file formats in Spark.

Native formats do not need any extra JAR to be installed. External formats need a separate JAR before Spark can read the files. Let us see some examples of how we can read these formats in Spark.
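To make the native/external distinction concrete, here is a minimal sketch of one way to make an extra JAR available through the `spark.jars.packages` setting. The helper names `build_session_with_packages` and `read_external` are hypothetical, and the Avro coordinate in the comments is only an illustration: the Scala and Spark versions in it must match your own installation.

```python
def build_session_with_packages(builder, packages):
    """Return a session that fetches the given Maven packages at startup.

    `builder` is assumed to be `SparkSession.builder`; `packages` is a list
    of Maven coordinates chosen by the caller (hypothetical helper).
    """
    # spark.jars.packages expects a comma-separated list of coordinates
    # and must be set before the session is created.
    return builder.config("spark.jars.packages", ",".join(packages)).getOrCreate()


def read_external(spark, fmt, path):
    """Read `path` using a data source named `fmt` that is not built in."""
    return spark.read.format(fmt).load(path)


# Example usage (assumes pyspark is installed; versions are illustrative):
# from pyspark.sql import SparkSession
# spark = build_session_with_packages(
#     SparkSession.builder,
#     ["org.apache.spark:spark-avro_2.12:3.5.1"],
# )
# df_avro = read_external(spark, "avro", "path/to/your/file.avro")
```

A native format such as CSV skips the first helper entirely and can be read from a plain session.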

CSV

Reading CSV:

df_csv = spark.read.csv('path/to/your/csvfile.csv'…
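The call above is cut off in the source, so its original arguments are unknown. As a separate, hedged sketch, a CSV read with a few commonly used reader options might look like this. The helper name `read_csv_with_options` and the choice of options are assumptions for illustration, not the author's exact call.

```python
def read_csv_with_options(spark, path, schema=None):
    """Read a CSV file into a DataFrame (hypothetical helper).

    `spark` is assumed to be an active SparkSession. `schema` may be a DDL
    string such as "id INT, name STRING"; when omitted, Spark infers types.
    """
    reader = (
        spark.read
        .option("header", "true")  # treat the first line as column names
        .option("sep", ",")        # field delimiter
    )
    if schema is not None:
        # An explicit schema avoids the extra pass over the data
        # that schema inference needs.
        reader = reader.schema(schema)
    else:
        reader = reader.option("inferSchema", "true")
    return reader.csv(path)


# Example usage (assumes an existing SparkSession named `spark`):
# df_csv = read_csv_with_options(spark, "path/to/your/csvfile.csv")
# df_csv.printSchema()
```

Passing an explicit schema is usually the better choice for large files, since inference requires Spark to scan the data before the real read.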
