Spark read CSV to create RDD into DataFrame

0 votes

I am trying to parse a file into an RDD using a case class and then convert the RDD into a DataFrame, but I couldn't get it to work. I am trying to use spark.read.csv. Please help.

Jan 22, 2019 in Big Data Hadoop by slayer

1 answer to this question.

0 votes

You can use a case class with an RDD and then convert it to a DataFrame, as shown in the sketch below.
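Here is a minimal sketch of that approach. The Airport case class, its fields, and the airports.csv path are assumptions for illustration (the sketch also assumes the file has no header row); adjust them to your data.

import org.apache.spark.sql.SparkSession

// Hypothetical case class describing one CSV row; adjust the fields to your file.
case class Airport(airportId: Int, name: String, country: String)

val spark = SparkSession.builder().appName("csv-to-df").getOrCreate()
import spark.implicits._   // needed for the toDF() conversion

val rdd = spark.sparkContext.textFile("airports.csv")      // read raw lines as an RDD
val df = rdd
  .map(_.split(","))                                        // split each line on commas
  .map(cols => Airport(cols(0).toInt, cols(1), cols(2)))    // map columns to the case class
  .toDF()                                                   // RDD[Airport] -> DataFrame

df.show()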

The common syntax to create a DataFrame directly from a CSV file is shown below for your reference:

val df = spark.read.option("header", "true").option("inferSchema", "true").csv("")

This works if you are relying on the schema inferred from the CSV file itself.

If you don't want to rely on schema inference, you can define a schema explicitly, for example:

import org.apache.spark.sql.types.{StructType, StructField, IntegerType}
val schema = StructType(Array(StructField("AirportID", IntegerType, true)))
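To apply the explicit schema when reading, a sketch along these lines should work (the airports.csv path here is only a placeholder):

val dfWithSchema = spark.read
  .option("header", "true")   // skip the header row in the file
  .schema(schema)             // use the explicit schema instead of inferSchema
  .csv("airports.csv")        // placeholder path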
answered Jan 22, 2019 by Omkar
