Trending questions in Apache Spark

0 votes
1 answer

Date formats: how to cast string to date?

Try this, it should work: > from pyspark.sql.functions ...READ MORE

Jul 29, 2019 in Apache Spark by Niall
6,027 views
0 votes
1 answer

Spark Null Pointer Exception.

I used Spark 1.5.2 with Hadoop 2.6 ...READ MORE

Jul 19, 2019 in Apache Spark by ravikiran
• 4,620 points
6,451 views
0 votes
1 answer

How to append a list in Scala?

Hey, For this purpose, we use the single ...READ MORE

Jul 26, 2019 in Apache Spark by Gitika
• 65,770 points
5,993 views
0 votes
1 answer

How to compute the square root of sum of squares of numbers?

Hey, You need to follow some steps to complete ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
5,954 views
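
The answer preview above is cut off, so the steps it refers to are not visible in this listing. As a rough sketch of the computation the question asks about (not the answer's own code), the idea is to square each number, add the squares, and take the square root. The function name `root_sum_of_squares` and the sample input are assumptions made for illustration, and plain Python is used instead of an RDD so the snippet runs without a Spark installation:

```python
import math

def root_sum_of_squares(numbers):
    # Hypothetical helper: square each number (the "map" step),
    # add the squares together (the "reduce" step), then take the root.
    total = sum(x * x for x in numbers)
    return math.sqrt(total)

print(root_sum_of_squares([3, 4]))  # 5.0
```

On an RDD the same shape would typically be a map followed by a reduce, with the square root applied to the final value on the driver.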
+1 vote
1 answer

How do I turn off INFO Logging in Spark?

Hi, You need to edit one property in ...READ MORE

Jul 12, 2019 in Apache Spark by ravikiran
• 4,620 points

edited Dec 20, 2020 by MD
6,308 views
0 votes
1 answer

How to create RDD from existing RDD in Scala?

scala> val rdd1 = sc.parallelize(List(1,2,3,4,5)) - Creating ...READ MORE

Feb 29, 2020 in Apache Spark by anonymous
1,449 views
0 votes
1 answer

Read multiple xml files in Spark

You can do this using globbing. See ...READ MORE

Jul 25, 2019 in Apache Spark by Jack
5,677 views
0 votes
1 answer

What are a Spark job, a Spark task, and a Spark stage?

In a Spark application, when you invoke ...READ MORE

Mar 18, 2019 in Apache Spark by Pavan
11,233 views
+1 vote
1 answer

How to read data from a text file in Spark?

Hey, You can try this: from pyspark import SparkContext SparkContext.stop(sc) sc ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,770 points
5,054 views
+1 vote
1 answer

How to add package com.databricks.spark.avro in spark?

Start spark shell using below line of ...READ MORE

Jul 10, 2019 in Apache Spark by Jishnu
6,220 views
+1 vote
2 answers

StreamingContext.textFileStream(localPathDirectory): I am getting empty results

Hey @c.kothamasu You should copy your file to ...READ MORE

Nov 7, 2019 in Apache Spark by Manas
958 views
0 votes
1 answer

Which File System is supported by Apache Spark?

Hi, Apache Spark is an advanced data processing ...READ MORE

Jul 5, 2019 in Apache Spark by Gitika
• 65,770 points
6,368 views
0 votes
1 answer

How to launch a Spark application in cluster mode?

Hi, To launch a Spark application in cluster mode, ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
4,971 views
0 votes
1 answer

error: identifier expected but integer literal found.

Hi, You can resolve this error with a ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
6,243 views
0 votes
1 answer

Spark error: stack overflow when unioning many RDDs.

Hey, Use SparkContext.union(...) instead to union many RDDs at once. You ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
5,020 views
0 votes
1 answer

Scala - Error in Inheritance: <console>:: error: not found: value

You need to declare the variable which ...READ MORE

Aug 1, 2019 in Apache Spark by Karan
4,890 views
0 votes
1 answer

PySpark not starting: No active sparkcontext

Seems like Spark Hadoop daemons are not ...READ MORE

Jul 30, 2019 in Apache Spark by Jishan
4,805 views
+1 vote
1 answer

How to convert JSON file to Avro file and vice versa

Try including the package while starting the ...READ MORE

Aug 26, 2019 in Apache Spark by Karan
3,609 views
0 votes
1 answer

Cache() vs persist() in Spark

cache() uses only the default storage level ...READ MORE

Mar 8, 2019 in Apache Spark by Raj
10,969 views
0 votes
1 answer

How to check if a particular keyword exists in Apache Spark?

Hey, You can try this code to get ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
4,956 views
0 votes
1 answer

Spark Error: StackOverflowError : Exception in thread "main" java.lang.StackOverflowError at org.apache.spark.rdd.UnionRDD$$anonfun$1.apply

Hey, It already has SparkContext.union and it does know how to ...READ MORE

Jul 31, 2019 in Apache Spark by Gitika
• 65,770 points
4,468 views
+1 vote
1 answer

What is reduce() action in Spark?

Hey, It takes a function that operates on two ...READ MORE

Jul 2, 2019 in Apache Spark by Gitika
• 65,770 points
5,629 views
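
The preview above is truncated, but it describes reduce() as taking a function that operates on two elements. A minimal sketch of that idea follows, using Python's built-in `functools.reduce` on a local list as a stand-in for an RDD. The helper name `total` and the sample data are assumptions for illustration, not part of the original answer:

```python
from functools import reduce

def total(numbers):
    # reduce applies the two-argument function to a running result and the
    # next element, until a single value remains.
    return reduce(lambda a, b: a + b, numbers)

print(total([1, 2, 3, 4, 5]))  # 15
```

In Spark the function passed to reduce() should be commutative and associative, since partial results are combined across partitions in no guaranteed order.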
0 votes
1 answer

What is ofDim in Scala?

Hey, ofDim() is a method in Scala that ...READ MORE

Jul 24, 2019 in Apache Spark by Gitika
• 65,770 points
4,691 views
0 votes
1 answer

error: identifier expected but ']' found.

Hi, You can try this remove brackets from ...READ MORE

Jul 3, 2019 in Apache Spark by Gitika
• 65,770 points
5,565 views
+1 vote
2 answers

What is sparkContext?

SparkContext sets up internal services and establishes ...READ MORE

Dec 5, 2019 in Apache Spark by anonymous
2,094 views
0 votes
1 answer

How to find the number of elements present in the array in a Spark DataFrame column?

You can select the column and apply ...READ MORE

Jun 6, 2018 in Apache Spark by Shubham
• 13,490 points
22,457 views
0 votes
1 answer

Spark-shell not working

First, reboot the system. And after reboot, ...READ MORE

Jul 15, 2019 in Apache Spark by Mahesh
4,940 views
0 votes
1 answer

Load .xlsx files to hive tables with spark scala

This should work: def readExcel(file: String): DataFrame = ...READ MORE

Jul 22, 2019 in Apache Spark by Kishan
4,606 views
0 votes
1 answer

How do I find the max and min values in a set in Scala?

Hey, Here is an example which will return ...READ MORE

Jul 30, 2019 in Apache Spark by Gitika
• 65,770 points
4,164 views
+1 vote
2 answers

How can I convert Spark Dataframe to Spark RDD?

Assuming your RDD[Row] is called rdd, you ...READ MORE

Jul 9, 2018 in Apache Spark by zombie
• 3,790 points
20,737 views
0 votes
1 answer

How to remove the elements with a key present in any other RDD?

Hey, You can use the subtractByKey() function to ...READ MORE

Jul 22, 2019 in Apache Spark by Gitika
• 65,770 points
4,231 views
0 votes
1 answer

Spark: How can I create temp views in a user-defined database instead of the default database?

You can try the below code: df.registerTempTable("airports") sqlContext.sql(" create ...READ MORE

Jul 14, 2019 in Apache Spark by Ishan
4,528 views
+1 vote
1 answer

Scala: Convert text file data into ORC format using data frame

Converting text file to ORC: Using Spark, the ...READ MORE

Aug 1, 2019 in Apache Spark by Esha
3,647 views
0 votes
1 answer

How to increase worker timeout in Spark application?

By default, the timeout is set to ...READ MORE

Mar 25, 2019 in Apache Spark by Hari
9,121 views
0 votes
1 answer

Copy file from local to hdfs from the spark job in yarn mode

Refer to the below code: import org.apache.hadoop.conf.Configuration import org.apache.hadoop.fs.FileSystem import ...READ MORE

Jul 24, 2019 in Apache Spark by Yogi
3,891 views
0 votes
1 answer

Spark Installation problem

After downloading Spark, you need to set ...READ MORE

Jul 5, 2019 in Apache Spark by Rishi
4,709 views
+1 vote
1 answer

How to extract record from one RDD using another RDD

Hey, you can use "contains" filter to extract ...READ MORE

Aug 23, 2019 in Apache Spark by Karan
2,448 views
0 votes
1 answer

What is Spark UI and how to monitor a spark job?

Hey, Jobs: to view all the Spark jobs. Stages: ...READ MORE

Aug 6, 2019 in Apache Spark by Gitika
• 65,770 points
3,128 views
0 votes
1 answer

Scala: 30: error: value partitions is not a member of String

Try this code: val rdd = sc.textFile("file.txt", 5) rdd.partitions.size Output ...READ MORE

Jul 29, 2019 in Apache Spark by Nijit
3,449 views
+1 vote
1 answer

Error: value textfile is not a member of org.apache.spark.SparkContext

Hi, Regarding this error, you just need to change ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,417 views
0 votes
1 answer

How to save RDD in Apache Spark?

Hey, There are a few methods provided by the ...READ MORE

Jul 23, 2019 in Apache Spark by Gitika
• 65,770 points
3,650 views
0 votes
1 answer

How to add package com.databricks.spark.avro in spark?

Start spark shell using below line of ...READ MORE

Jul 23, 2019 in Apache Spark by Ritu
3,589 views
0 votes
1 answer

Primary keys in Apache Spark

I found the following solution to be ...READ MORE

Sep 11, 2019 in Apache Spark by ravikiran
• 4,620 points
1,385 views
+1 vote
1 answer

Need to load 40 GB data to elasticsearch using spark

Did you find any documents or example ...READ MORE

Nov 5, 2019 in Apache Spark by Begum
1,335 views
0 votes
1 answer

What is RDD Lineage in Spark?

Hey, Lineage is an RDD process to reconstruct ...READ MORE

Jul 4, 2019 in Apache Spark by Gitika
• 65,770 points
4,319 views
0 votes
1 answer

Load/save text file in Spark

The reason you are able to load ...READ MORE

Jul 22, 2019 in Apache Spark by Giri
3,491 views
0 votes
1 answer

Scala: error: value unary_+ is not a member of (Int, Int)

All prefix operators' symbols are predefined: +, -, ...READ MORE

Jul 22, 2019 in Apache Spark by karan
3,455 views
+1 vote
1 answer

By default, how many partitions are created in an RDD in Apache Spark?

Well, it depends on the block of ...READ MORE

Aug 2, 2019 in Apache Spark by Gitika
• 65,770 points
2,856 views