val df1 = sqlContext.read.json("file:///home/edureka/Desktop/datsets/world_bank.json") // loads the file; give the appropriate path
sqlContext is the entry point in Spark versions below 2.0, but it still works in later versions.
For Spark 2.0 and above, use the Spark session with the line below instead:
val df1 = spark.read.json("file:///home/edureka/Desktop/datsets/world_bank.json")
df1.printSchema() // provides the schema details
df1.select("id", "countrycode").show()
Output table: [image]
In the same way, if you have numeric or integer data, you can use the DataFrame APIs to filter the data as you mentioned,
or create a temporary view from the DataFrame:
// Creates a temporary view using the DataFrame
df1.createOrReplaceTempView("table1")
Then use SQL statements to query it. For example, if the table has an age field:
val age = spark.sql("SELECT name FROM table1 WHERE age > 25")
age.show()
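The same filter can also be written with the DataFrame API instead of SQL. Below is a minimal sketch, not tied to the world_bank.json file: the sample rows, the `people` and `olderThan25` names, and the local-mode session are assumptions made only so the snippet runs on its own. In spark-shell the `spark` session already exists, so the builder line is not needed there.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Only needed in a standalone program; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .appName("filter-sketch")
  .master("local[*]")
  .getOrCreate()

// Hypothetical sample data standing in for a table with name and age columns.
val people = spark
  .createDataFrame(Seq(("Alice", 30), ("Bob", 22), ("Carol", 41)))
  .toDF("name", "age")

// DataFrame API equivalent of: SELECT name FROM table1 WHERE age > 25
val olderThan25 = people.filter(col("age") > 25).select("name")
olderThan25.show()
```

Both forms go through the same query planner, so choosing between them is mostly a matter of which reads better for your query.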
To save this DataFrame, for example in CSV format, use the statement below.
age.write.format("csv").save("file path/..././namesAndAges.csv")
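Note that save writes a directory containing part files at the given path, not a single CSV file. Here is a small sketch of writing with a header row and reading the result back. The sample data, the temporary output directory, and the `names`, `outDir` and `reloaded` names are assumptions for illustration only; replace them with your own DataFrame and path.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession

// Only needed in a standalone program; spark-shell already provides `spark`.
val spark = SparkSession.builder()
  .appName("csv-sketch")
  .master("local[*]")
  .getOrCreate()

// Hypothetical stand-in for the `age` DataFrame, which holds a single name column.
val names = spark
  .createDataFrame(Seq(Tuple1("Alice"), Tuple1("Carol")))
  .toDF("name")

// Hypothetical output location: a fresh temporary directory, as a file:// URI.
val outDir = Files.createTempDirectory("spark-csv-sketch").resolve("names").toUri.toString

names.write
  .mode("overwrite")        // replace the output directory if it already exists
  .option("header", "true") // write the column names as the first line
  .csv(outDir)

// Read the directory back to check what was written.
val reloaded = spark.read.option("header", "true").csv(outDir)
reloaded.show()
```

Without the header option the column names are not written, and without a save mode the write fails if the target path already exists.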
Hope this helps!
Thanks!!