File not found exception while running a Spark job in YARN cluster mode on a multinode Hadoop cluster

0 votes

Application application_1595939708277_0012 failed 2 times due to AM Container for appattempt_1595939708277_0012_000002 exited with exitCode: -1000

For more detailed output, check the application tracking page: http://JDESXSRV14S52:8088/cluster/app/application_1595939708277_0012 Then, click on links to logs of each attempt.

Diagnostics: File file:/home/hadoop/.sparkStaging/application_1595939708277_0012/ms-fs-4.1.0.0-release.jar does not exist

java.io.FileNotFoundException: File file:/home/hadoop/.sparkStaging/application_1595939708277_0012/ms-fs-4.1.0.0-release.jar does not exist
at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Failing this attempt. Failing the application.
Jul 30, 2020 in Apache Spark by Ganendra
• 140 points

recategorized Jul 30, 2020 by MD 4,592 views

1 answer to this question.

0 votes

Hi @Ganendra,

I am not sure what the issue is, so you will probably need to do a bit of troubleshooting. First, check whether the jar file mentioned above exists. If it does not, you can try downloading it manually from the internet. If it does, check the owner and permission mask of the directories. Also check the configuration files, because a common mistake is forgetting to set HADOOP_CONF_DIR or pointing it at the wrong path.

Add HADOOP_CONF_DIR=/opt/hadoop-2.7.3/etc/hadoop/ to ./conf/spark-env.sh
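To confirm that the directory is actually usable, a quick check like the following may help. It is only a sketch: the two file names are the client-side files Spark on YARN normally reads from HADOOP_CONF_DIR, and the function name missing_conf_files is made up for this example.

```python
import os
from pathlib import Path
from typing import List, Optional

# Client-side files Spark on YARN usually needs (assumed list, adjust as needed).
REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")


def missing_conf_files(conf_dir: Optional[str]) -> List[str]:
    """Return the required files that are absent from conf_dir.

    An unset or non-existent directory counts as missing everything.
    """
    if not conf_dir or not Path(conf_dir).is_dir():
        return list(REQUIRED_FILES)
    return [name for name in REQUIRED_FILES
            if not (Path(conf_dir) / name).is_file()]


if __name__ == "__main__":
    conf_dir = os.environ.get("HADOOP_CONF_DIR")
    missing = missing_conf_files(conf_dir)
    print(f"HADOOP_CONF_DIR={conf_dir!r}, missing: {missing or 'nothing'}")
```

Run it on the machine you submit from, in the same shell that runs spark-submit, so it sees the same environment.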

Check your core-site.xml file with the below entries.

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://masterip:port</value>
</property>
</configuration>
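The path in the error starts with file:, which is what you would expect if fs.defaultFS is not being picked up: Hadoop's built-in default is file:///, so the staging directory ends up on the local disk of the submitting machine, where the other nodes cannot read it. A small sketch to check what your core-site.xml resolves to is shown below. The sample XML, the port 9000, and the function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hadoop's built-in value when fs.defaultFS is not configured.
HADOOP_DEFAULT_FS = "file:///"


def default_fs(core_site_xml: str) -> str:
    """Return fs.defaultFS from core-site.xml text, or Hadoop's default."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", "").strip() == "fs.defaultFS":
            return prop.findtext("value", "").strip()
    return HADOOP_DEFAULT_FS


def is_shared_fs(uri: str) -> bool:
    """True if the URI scheme is something other than the local file system."""
    return urlparse(uri).scheme not in ("", "file")


# Hypothetical sample; "masterip" and port 9000 are placeholders.
SAMPLE = """
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://masterip:9000</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    fs = default_fs(SAMPLE)
    print(fs, "shared" if is_shared_fs(fs) else "local only")
```

To check a real cluster, read the text of the core-site.xml in your HADOOP_CONF_DIR and pass it to default_fs instead of SAMPLE.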

Also, remove the setMaster('local') call from the source file. A master set in code takes precedence over the --master yarn option passed to spark-submit.

answered Jul 30, 2020 by MD
• 95,460 points

Thanks for your reply.

The jar specified is an external jar which I am passing using 

spark-submit --deploy-mode cluster --master yarn --jars hdfs://ms_XX.jar --class XXX XX.jar

The above-specified configurations are done as an initial step. Still, the issue persists.

The jar is downloaded to the local file system under

/home/hadoop/.sparkStaging/application_1595939708277_0012, but the error still says that the jar does not exist.

Please provide your valuable inputs.

Thanks & Regards,

Gani

I think your command uses an HDFS path for the jar file, but the jar is present only on your local file system. You need to upload the file to HDFS first.
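As a rough illustration of that step, the sketch below builds (but does not run) the upload and submit commands, and shows how an hdfs:// URI is split. Note that in hdfs://something the part right after // is treated as the NameNode address, so if the real command has the same shape as the one posted, the jar name would be read as a host and the path would be empty. The local jar location, the HDFS directory /user/hadoop/libs, and the helper names are assumptions; XXX and XX.jar are the placeholders from the question.

```python
import shlex
from typing import List, Tuple
from urllib.parse import urlparse


def describe(uri: str) -> Tuple[str, str, str]:
    """Split a URI into (scheme, authority, path)."""
    parts = urlparse(uri)
    return (parts.scheme, parts.netloc, parts.path)


def upload_cmd(local_jar: str, hdfs_dir: str) -> List[str]:
    """Command that copies a local jar into an HDFS directory (-f overwrites)."""
    return ["hdfs", "dfs", "-put", "-f", local_jar, hdfs_dir]


def submit_cmd(app_jar: str, main_class: str, extra_jars: List[str]) -> List[str]:
    """spark-submit command for YARN cluster mode with extra jars."""
    return ["spark-submit", "--master", "yarn", "--deploy-mode", "cluster",
            "--jars", ",".join(extra_jars), "--class", main_class, app_jar]


if __name__ == "__main__":
    # Hypothetical locations, adjust to your cluster.
    local_jar = "/home/hadoop/ms-fs-4.1.0.0-release.jar"
    hdfs_dir = "hdfs:///user/hadoop/libs"
    hdfs_jar = hdfs_dir + "/ms-fs-4.1.0.0-release.jar"

    print(describe("hdfs://ms_XX.jar"))  # jar name lands in the authority slot
    print(describe(hdfs_jar))            # empty authority, full path
    print(shlex.join(["hdfs", "dfs", "-mkdir", "-p", hdfs_dir]))
    print(shlex.join(upload_cmd(local_jar, hdfs_dir)))
    print(shlex.join(submit_cmd("XX.jar", "XXX", [hdfs_jar])))
```

With hdfs:/// (three slashes) the NameNode is taken from fs.defaultFS, so this form only works once the configuration above is being read correctly.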
