Trending questions in Apache Spark

0 votes
1 answer

Can anyone explain the sparse vector in Spark?

Hey, a sparse vector is used for storing ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
5,821 views
0 votes
1 answer

Spark Null Pointer Exception.

I used Spark 1.5.2 with Hadoop 2.6 ...

Jul 19, 2019 in Apache Spark by ravikiran
• 4,620 points
6,341 views
0 votes
1 answer

How to append a list in Scala?

Hey, for this purpose, we use the single ...

Jul 26, 2019 in Apache Spark by Gitika
• 65,770 points
5,928 views
0 votes
1 answer

How to compute the square root of sum of squares of numbers?

Hey, you need to follow some steps to complete ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
5,883 views
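
The teaser above is cut off before the steps, so the following is not the answer's own code. As a minimal sketch of the computation the question asks about, here is a plain-Python version that runs without Spark; the function name root_sum_of_squares and the sample input are made up for illustration.

```python
import math

def root_sum_of_squares(values):
    # Square each number, add the squares together, then take the square root.
    return math.sqrt(sum(v * v for v in values))

result = root_sum_of_squares([3, 4])
print(result)
```

On an RDD the same idea would likely be a map to squares followed by a sum and a square root on the driver, but that is an assumption about the hidden answer, not a quote from it.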
+1 vote
1 answer

How do I turn off INFO Logging in Spark?

Hi, you need to edit one property in ...

Jul 12, 2019 in Apache Spark by ravikiran
• 4,620 points

edited Dec 20, 2020 by MD
6,251 views
0 votes
1 answer

How to create an RDD from an existing RDD in Scala?

scala> val rdd1 = sc.parallelize(List(1,2,3,4,5)) - Creating ...

Feb 29, 2020 in Apache Spark by anonymous
1,423 views
0 votes
1 answer

Read multiple XML files in Spark

You can do this using globbing. See ...

Jul 25, 2019 in Apache Spark by Jack
5,645 views
+1 vote
1 answer

How to read data from a text file in Spark?

Hey, you can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...

Aug 6, 2019 in Apache Spark by Gitika
• 65,770 points
5,021 views
0 votes
1 answer

What are a Spark job, a Spark task, and a Spark stage?

In a Spark application, when you invoke ...

Mar 18, 2019 in Apache Spark by Pavan
11,160 views
+1 vote
2 answers

sparkstream.textfilstreaming(localpathdirectory). I am getting empty results

Hey @c.kothamasu, you should copy your file to ...

Nov 7, 2019 in Apache Spark by Manas
920 views
+1 vote
1 answer

How to add package com.databricks.spark.avro in spark?

Start the Spark shell using the below line of ...

Jul 10, 2019 in Apache Spark by Jishnu
6,116 views
0 votes
1 answer

Which File System is supported by Apache Spark?

Hi, Apache Spark is an advanced data processing ...

Jul 5, 2019 in Apache Spark by Gitika
• 65,770 points
6,317 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, to launch a Spark application in cluster mode, ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
4,947 views
0 votes
1 answer

Spark error: throws stack overflow when unioning a lot.

Hey, use SparkContext.union(...) instead to union many RDDs at once. You ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
4,966 views
0 votes
1 answer

error: identifier expected but integer literal found.

Hi, you can resolve this error with a ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
6,140 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...

Aug 1, 2019 in Apache Spark by Karan
4,842 views
+1 vote
1 answer

How to convert JSON file to AVRO file and vice versa

Try including the package while starting the ...

Aug 26, 2019 in Apache Spark by Karan
3,572 views
0 votes
1 answer

cache() vs persist() in Spark

cache() uses only the default storage level ...

Mar 8, 2019 in Apache Spark by Raj
10,938 views
0 votes
1 answer

PySpark not starting: No active SparkContext

Seems like the Spark Hadoop daemons are not ...

Jul 30, 2019 in Apache Spark by Jishan
4,711 views
0 votes
1 answer

How to check if a particular keyword exists in Apache Spark?

Hey, you can try this code to get ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
4,871 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, it already has SparkContext.union and it does know how to ...

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
4,407 views
+1 vote
1 answer

What is reduce() action in Spark?

Hey, it takes a function that operates on two ...

Jul 2, 2019 in Apache Spark by Gitika
• 65,770 points
5,570 views
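
The teaser above is cut off, so this is not the answer's own code. As a rough sketch of the idea it starts to describe (a function that takes two elements and returns one), here is a plain-Python stand-in using functools.reduce, which runs locally without Spark. The names add and numbers are made up for illustration.

```python
from functools import reduce

def add(a, b):
    # Combine two elements into a single value of the same type.
    return a + b

numbers = [1, 2, 3, 4, 5]

# Fold the list pairwise: ((((1 + 2) + 3) + 4) + 5)
total = reduce(add, numbers)
print(total)
```

In PySpark the analogous call would be rdd.reduce(add) on an RDD built from numbers. Because Spark combines partial results from different partitions, the function passed in should be commutative and associative.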
0 votes
1 answer

What is ofDim in Scala?

Hey, ofDim() is a method in Scala that ...

Jul 24, 2019 in Apache Spark by Gitika
• 65,770 points
4,639 views
0 votes
1 answer

error: identifier expected but ']' found.

Hi, you can try this: remove brackets from ...

Jul 3, 2019 in Apache Spark by Gitika
• 65,770 points
5,531 views
+1 vote
2 answers

What is SparkContext?

SparkContext sets up internal services and establishes ...

Dec 5, 2019 in Apache Spark by anonymous
2,053 views
0 votes
1 answer

How to find the number of elements present in the array in a Spark DataFrame column?

You can select the column and apply ...

Jun 6, 2018 in Apache Spark by Shubham
• 13,490 points
22,394 views
0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...

Jul 15, 2019 in Apache Spark by Mahesh
4,875 views
0 votes
1 answer

Load .xlsx files to Hive tables with Spark Scala

This should work: def readExcel(file: String): DataFrame = ...

Jul 22, 2019 in Apache Spark by Kishan
4,529 views
0 votes
1 answer

How do I find the max and min values in a set in Scala?

Hey, here is an example which will return ...

Jul 30, 2019 in Apache Spark by Gitika
• 65,770 points
4,136 views
+1 vote
2 answers

How can I convert a Spark DataFrame to a Spark RDD?

Assuming your RDD[Row] is called rdd, you ...

Jul 9, 2018 in Apache Spark by zombie
• 3,790 points
20,642 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, you can use the subtractByKey() function to ...

Jul 22, 2019 in Apache Spark by Gitika
• 65,770 points
4,168 views
0 votes
1 answer

Spark: How can I create temp views in a user-defined database instead of the default database?

You can try the below code: df.registerTempTable("airports") sqlContext.sql(" create ...

Jul 14, 2019 in Apache Spark by Ishan
4,487 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting a text file to ORC: using Spark, the ...

Aug 1, 2019 in Apache Spark by Esha
3,619 views
0 votes
1 answer

Copy file from local to HDFS from the Spark job in YARN mode

Refer to the below code: import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem import ...

Jul 24, 2019 in Apache Spark by Yogi
3,823 views
0 votes
1 answer

Spark Installation problem

After downloading Spark, you need to set ...

Jul 5, 2019 in Apache Spark by Rishi
4,632 views
0 votes
1 answer

How to increase worker timeout in Spark application?

By default, the timeout is set to ...

Mar 25, 2019 in Apache Spark by Hari
9,024 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use the "contains" filter to extract ...

Aug 23, 2019 in Apache Spark by Karan
2,400 views
0 votes
1 answer

What is Spark UI and how to monitor a spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...

Aug 6, 2019 in Apache Spark by Gitika
• 65,770 points
3,094 views
0 votes
1 answer

Scala: 30: error: value partitions is not a member of String

Try this code: val rdd = sc.textFile("file.txt", 5) rdd.partitions.size Output ...

Jul 29, 2019 in Apache Spark by Nijit
3,382 views
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, there are a few methods provided by the ...

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
3,616 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, regarding this error, you just need to change ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,357 views
0 votes
1 answer

How to add package com.databricks.spark.avro in spark?

Start the Spark shell using the below line of ...

Jul 23, 2019 in Apache Spark by Ritu
3,569 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
1,357 views
+1 vote
1 answer

Need to load 40 GB of data to Elasticsearch using Spark

Did you find any documents or example ...

Nov 5, 2019 in Apache Spark by Begum
1,317 views
0 votes
1 answer

What is RDD Lineage in Spark?

Hey, lineage is an RDD process to reconstruct ...

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,289 views
0 votes
1 answer

Load/save text file in Spark

The reason you are able to load ...

Jul 22, 2019 in Apache Spark by Giri
3,443 views
0 votes
1 answer

Scala: error: value unary_+ is not a member of (Int, Int)

All prefix operators' symbols are predefined: +, -, ...

Jul 22, 2019 in Apache Spark by karan
3,419 views
+1 vote
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
2,822 views