Scala save filtered data row by row using saveAsTextFile

0 votes

Hello Team,

I want to save filtered data row by row using saveAsTextFile. Kindly help.

I tried flatMap, but it flattens every column onto its own line.

val rdd1=sc.textFile("/user/edureka_40114/AppleStore.csv")

val rdd3=rdd1.map(x=>x.split(",")).filter(x=>x(12).equals("\"Games\"")).map(x=>(x(0),x(1)))

rdd3.collect()

I want the output saved to an HDFS file as shown below, but with flatMap every value ends up on a separate line.

"1","281656475"

"6","283619399"

"10","284736660"
Aug 2, 2019 in Apache Spark by Hari
1,765 views

1 answer to this question.

0 votes

Try this code, it worked for me:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext.implicits._

val data = sqlContext.read.format("csv").option("header", "false").load("/user/edureka_425640/AppleStore.csv")

data.write.format("csv").save("mobil_out")
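
Note that the snippet above writes the whole CSV back out without filtering. For the RDD approach in the question, it may help to know that saveAsTextFile writes one line per RDD element, so each element should already be the complete row as a single String. flatMap breaks one element into many, which is why the columns land on separate lines. Below is a minimal sketch, not the original poster's or answerer's code. The helper names isGame and toLine and the output path "games_out" are made up for illustration, and it assumes no field contains an embedded comma, because split(",") is a naive parser.

```scala
// Hypothetical helpers (names are illustrative, not from the post).

// True when the row has a 13th column equal to "Games" (quotes included,
// because the quoted CSV fields are kept as-is by split).
def isGame(cols: Array[String]): Boolean =
  cols.length > 12 && cols(12) == "\"Games\""

// Joins the first two columns into ONE String, so that one RDD element
// becomes exactly one output line.
def toLine(cols: Array[String]): String =
  cols(0) + "," + cols(1)

// Local check on plain Scala collections, no Spark needed:
val sample = Seq(
  "\"1\",\"281656475\"" + ",x" * 10 + ",\"Games\"",
  "\"2\",\"281796108\"" + ",x" * 10 + ",\"Productivity\""
)
val kept = sample.map(_.split(",")).filter(isGame).map(toLine)
kept.foreach(println)

// In spark-shell, where `sc` already exists, the same steps would be:
//   val rdd1 = sc.textFile("/user/edureka_40114/AppleStore.csv")
//   rdd1.map(_.split(",")).filter(isGame).map(toLine)
//       .saveAsTextFile("games_out")   // placeholder output directory
```

Spark writes the result as part files inside the output directory, one line per kept row. If the real file has commas inside quoted fields, a proper CSV reader (such as the DataFrame reader used above) would be a safer way to split the columns.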


Hope this helps!


Thanks!!

answered Aug 2, 2019 by Karan
