Spark Core: How to fetch the rows of an RDD with the longest lists without using RDD max

I have an RDD with the following elements:
('09', [25, 66, 67])
('17', [66, 67, 39])
('04', [25])
('08', [120, 122])
('28', [25, 67])
('30', [122])

I need to fetch the elements whose list has the maximum number of elements, which is 3 in the above RDD.
The output should be filtered into another RDD, without using the max function or **Spark DataFrames**:
('09', [25, 66, 67])
('17', [66, 67, 39])

max_len = uniqueRDD.max(lambda x: len(x[1]))
maxRDD = uniqueRDD.filter(lambda x: len(x[1]) == len(max_len[1]))

I can do this with the lines above, but Spark Streaming won't support it because max_len is a tuple and not an RDD.

Can someone suggest an approach? Thanks in advance.

Dec 3, 2020 in Apache Spark by Prashant
• 120 points • 2,346 views

1 answer to this question.

Hi @Prashant,

If Spark Streaming does not support a tuple here, then you need to convert the tuple to an RDD.
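One possible way to keep the result as an RDD, sketched in PySpark under some assumptions: the helper name `keep_longest` is hypothetical, each element is a `(key, list)` pair as in the question, and the longest length is found with `fold` and a plain comparison rather than `RDD.max`. This is only a sketch, not a tested streaming job.

```python
def keep_longest(rdd):
    """Return the (key, list) pairs whose list is as long as the longest one.

    `rdd` is assumed to hold (key, list) pairs, as in the question.
    """
    # Longest list length in this RDD, computed with fold instead of RDD.max.
    # fold returns the zero value (0) for an empty RDD, so empty batches are safe.
    longest = rdd.map(lambda kv: len(kv[1])).fold(
        0, lambda a, b: a if a >= b else b
    )
    # Keep only the pairs whose list reaches that length; the result is an RDD.
    return rdd.filter(lambda kv: len(kv[1]) == longest)
```

With the RDD from the question this would be called as `maxRDD = keep_longest(uniqueRDD)`. In Spark Streaming the same function could be applied to every batch with `stream.transform(keep_longest)`, where `stream` stands for whatever DStream carries these pairs (an assumed name). The RDD is traversed twice, so caching it first may help if it is expensive to compute.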

answered Dec 3, 2020 by MD
• 95,460 points
