How can I import zip files and process the Excel files inside them using PySpark connected to MongoDB (PyMongo)?
I installed Spark, MongoDB, and Python to process files (Excel, CSV, or JSON).
I used this code to connect PySpark to MongoDB:
from pyspark.sql import SparkSession

my_spark = SparkSession \
    .builder \
    .appName("myApp") \
    .config("spark.mongodb.input.uri", "mongodb://127.0.0.1/test.coll") \
    .config("spark.mongodb.output.uri", "mongodb://127.0.0.1/test.coll") \
    .getOrCreate()
Next, I tried to import zip files (I don't want to have to open every file manually to process it).
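One possible approach, as a minimal sketch rather than a definitive answer: read each archive with Python's standard `zipfile` module, pull out the Excel members as raw bytes, and only then hand them to pandas and Spark. The helper names `iter_excel_members` and `load_zip_into_spark` are hypothetical, and the second function assumes that pandas (plus an Excel engine such as openpyxl) is installed and that the archives are small enough to read on the driver.

```python
import io
import zipfile

# File extensions treated as Excel workbooks (an assumption for this sketch).
EXCEL_SUFFIXES = (".xlsx", ".xls")


def iter_excel_members(zip_source):
    """Yield (member_name, raw_bytes) for every Excel file in a zip archive.

    zip_source may be a path or a binary file-like object.
    Nothing is extracted to disk; each member is read into memory.
    """
    with zipfile.ZipFile(zip_source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip folder entries
            if info.filename.lower().endswith(EXCEL_SUFFIXES):
                yield info.filename, archive.read(info)


def load_zip_into_spark(spark, zip_path):
    """Return a list of Spark DataFrames, one per Excel file in the archive.

    Hypothetical helper: reads only the first sheet of each workbook and
    assumes every workbook fits in driver memory.
    """
    import pandas as pd  # imported lazily so the helper above stays stdlib-only

    frames = []
    for name, raw in iter_excel_members(zip_path):
        pdf = pd.read_excel(io.BytesIO(raw))  # first sheet by default
        pdf["source_file"] = name             # remember where each row came from
        frames.append(spark.createDataFrame(pdf))
    return frames
```

With the session above, `load_zip_into_spark(my_spark, "data.zip")` would return the DataFrames (the path `data.zip` is a placeholder). Assuming the MongoDB Spark connector version that matches the `spark.mongodb.output.uri` setting is on the classpath, each DataFrame could then be written with something like `df.write.format("mongo").mode("append").save()`; newer connector releases use a different format name and configuration keys, so check the documentation for the installed version.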