Job failed as tasks failed failedMaps

Hello All,

I am new to Hadoop and need help with an issue I am facing. There was a requirement to add new columns to a table, and I am now trying to test those changes. The M/R job is attempted 2-3 times and fails each time, so the Sqoop export fails as well. I looked at the logs but could not find any specific exception. I am sharing the log details below; if anyone has an idea, please let me know. Thanks in advance.

19/07/30 04:06:18 INFO tool.CodeGenTool: Beginning code generation

19/07/30 04:06:18 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM [staging].[SR] AS t WHERE 1=0

19/07/30 04:06:19 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/mapr/hadoop/hadoop-2.7.0

Note: /tmp/sqoop-/SR.java uses or overrides a deprecated API.

Note: Recompile with -Xlint:deprecation for details.

19/07/30 04:06:23 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-/SR.jar

19/07/30 04:06:28 INFO mapreduce.ExportJobBase: Beginning export of SR

19/07/30 04:06:28 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar

19/07/30 04:06:28 INFO mapreduce.JobBase: Setting default value for hadoop.job.history.user.location=none

19/07/30 04:06:29 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

19/07/30 04:06:29 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

19/07/30 04:06:29 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps

19/07/30 04:06:29 INFO client.MapRZKBasedRMFailoverProxyProvider: Updated RM address to 

19/07/30 04:06:31 INFO input.FileInputFormat: Total input paths to process : 1

19/07/30 04:06:31 INFO input.FileInputFormat: Total input paths to process : 1

19/07/30 04:06:31 INFO mapreduce.JobSubmitter: number of splits:1

19/07/30 04:06:31 INFO Configuration.deprecation: mapred.map.tasks.speculative.execution is deprecated. Instead, use mapreduce.map.speculative

19/07/30 04:06:31 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative

19/07/30 04:06:31 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps

19/07/30 04:06:31 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1563651888010_477141

19/07/30 04:06:32 INFO security.ExternalTokenManagerFactory: Initialized external token manager class - com.mapr.hadoop.yarn.security.MapRTicketManager

19/07/30 04:06:32 INFO impl.YarnClientImpl: Submitted application application_1563651888010_477141

19/07/30 04:06:32 INFO mapreduce.Job: The url to track the job: https://proxy/application_1563651888010_477141/

19/07/30 04:06:32 INFO mapreduce.Job: Running job: job_1563651888010_477141

19/07/30 04:06:49 INFO mapreduce.Job: Job job_1563651888010_477141 running in uber mode : false

19/07/30 04:06:49 INFO mapreduce.Job: map 0% reduce 0%

19/07/30 04:07:08 INFO mapreduce.Job: map 6% reduce 0%

19/07/30 04:17:18 INFO mapreduce.Job: Task Id : attempt_1563651888010_477141_m_000000_0, Status : FAILED

AttemptID:attempt_1563651888010_477141_m_000000_0 Timed out after 600 secs

19/07/30 04:17:19 INFO mapreduce.Job: map 0% reduce 0%

19/07/30 04:17:37 INFO mapreduce.Job: map 6% reduce 0%

19/07/30 04:27:48 INFO mapreduce.Job: Task Id : attempt_1563651888010_477141_m_000000_1, Status : FAILED

AttemptID:attempt_1563651888010_477141_m_000000_1 Timed out after 600 secs

19/07/30 04:27:49 INFO mapreduce.Job: map 0% reduce 0%

19/07/30 04:28:06 INFO mapreduce.Job: map 6% reduce 0%

19/07/30 04:38:18 INFO mapreduce.Job: Task Id : attempt_1563651888010_477141_m_000000_2, Status : FAILED

AttemptID:attempt_1563651888010_477141_m_000000_2 Timed out after 600 secs

Container killed by the ApplicationMaster.

Container killed on request. Exit code is 143

Container exited with a non-zero exit code 143


19/07/30 04:38:19 INFO mapreduce.Job: map 0% reduce 0%

19/07/30 04:38:44 INFO mapreduce.Job: map 6% reduce 0%

19/07/30 04:48:48 INFO mapreduce.Job: map 100% reduce 0%

19/07/30 04:48:48 INFO mapreduce.Job: Job job_1563651888010_477141 failed with state FAILED due to: Task failed task_1563651888010_477141_m_000000

Job failed as tasks failed. failedMaps:1 failedReduces:0


19/07/30 04:48:48 INFO mapreduce.Job: Counters: 10

Job Counters 

Failed map tasks=4

Launched map tasks=4

Other local map tasks=3

Rack-local map tasks=1

Total time spent by all maps in occupied slots (ms)=2513029

Total time spent by all reduces in occupied slots (ms)=0

Total time spent by all map tasks (ms)=2513029

Total vcore-seconds taken by all map tasks=2513029

Total megabyte-seconds taken by all map tasks=2573341696

DISK_MILLIS_MAPS=1256515

19/07/30 04:48:48 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead

19/07/30 04:48:48 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 2,539.3569 seconds (0 bytes/sec)

19/07/30 04:48:48 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead

19/07/30 04:48:48 INFO mapreduce.ExportJobBase: Exported 0 records.

19/07/30 04:48:48 ERROR tool.ExportTool: Error during export: Export job failed!

Aug 1, 2019 in Big Data Hadoop by Hemanth
edited Aug 1, 2019 by Omkar

These logs are generic. We can't find the reason for failure with these. Can you post the error log for this application?

yarn logs -applicationId <application ID> <options>
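
The client log above does carry one usable hint: every map attempt is reported as "Timed out after 600 secs", so the task appears to have stopped reporting progress rather than thrown an exception. As a rough sketch, assuming the Sqoop console output was saved to a file (the name sqoop_export.log below is hypothetical, as is the function name), the timed-out attempts can be listed like this:

```shell
#!/bin/sh
# Sketch: list map attempts that the client log reports as timed out.
# Reads the saved Sqoop/MapReduce console output on stdin and prints
# "<attempt id> <timeout in seconds>" for each matching line.
timed_out_attempts() {
    sed -n 's/^AttemptID:\(attempt_[0-9a-z_]*\) Timed out after \([0-9][0-9]*\) secs.*$/\1 \2/p'
}

# Example (sqoop_export.log is a hypothetical file name):
#   timed_out_attempts < sqoop_export.log
```

The 600-second figure matches the usual default of mapreduce.task.timeout (600000 ms), so the container logs are most useful for seeing what the mapper was waiting on when it was killed.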

commented Aug 1, 2019 by Shruthi

Hello Shruthi, thanks for the reply. I didn't quite understand your question. How do I get the error log? I am very new to this; could you guide me if possible?

commented Aug 1, 2019 by Hemanth

8090/proxy/application_1563651888010_526091

Is this the application ID that you are asking for?

commented Aug 1, 2019 by Hemanth

Hi @Hemanth. When you are running the MapReduce job, run the following command in a new terminal:

yarn application -list -appStates RUNNING

This will print all the running applications along with their application IDs. Now use the ID of your job in the command below:

yarn logs -applicationId <application ID>

When the job fails, you'll see the log telling you why it failed. Posting that here will help me analyze the reason for the failure.
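
If the console output of the job was kept, the application ID can also be read from it instead of from the list of running applications. The log in the question suggests that the application ID is the job ID with "job_" swapped for "application_"; treat that mapping as an assumption drawn from that log. A minimal sketch, with illustrative function names:

```shell
#!/bin/sh
# Print the first application ID found in log text read from stdin.
app_id_from_log() {
    grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Turn a MapReduce job ID into the matching application ID.
# Assumes the two differ only in their prefix, as in the log above.
job_to_app_id() {
    printf '%s\n' "$1" | sed 's/^job_/application_/'
}
```

For example, yarn logs -applicationId "$(job_to_app_id job_1563651888010_477141)" would ask for the logs of the job shown in the question.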

commented Aug 1, 2019 by Shruthi

Hi @Shruthi, I tried what you suggested, but it says the logs are not present. Please check the output below.

[phodisvc@cstg-sa-pdi-prd-02 ~]$ yarn logs -applicationId application_1563651888010_526091
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/08/01 07:12:59 INFO client.MapRZKBasedRMFailoverProxyProvider: Updated RM address to hdprd-c01-r03-01.cisco.com/173.36.31.26:8032
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
/tmp/logs/phodisvc/logs/application_1563651888010_526091 does not exist.
Log aggregation has not completed or is not enabled.

commented Aug 1, 2019 by Hemanth

Are you checking for the logs as the same user that started the application?

commented Aug 2, 2019 by Shruthi

@Shruthi, hi. I checked with this command:

yarn application -list -appStates RUNNING

It shows many jobs, but the one I am looking for is not there. So my question is: are the logs available only while the job is running, and can they no longer be checked after that? Please suggest. Thanks.

commented Aug 2, 2019 by Hemanth
edited Aug 2, 2019 by Omkar

@Hemanth, only the jobs that are running or have finished will be present here. Because your job is not completing, it won't be finished. So to get the logs, you will have to fetch them while the job is running.
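
One thing worth trying here: the -appStates filter of yarn application -list is not limited to RUNNING. It also accepts states such as FINISHED, FAILED, KILLED, or ALL, so an application that has already stopped may still be listed. Whether its logs can then be fetched still depends on log aggregation being enabled on the cluster. A small sketch, with a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: look for an application in any state, not only RUNNING.
# $1 = full or partial application ID.
# Prints the matching lines of `yarn application -list -appStates ALL`
# and returns non-zero when nothing matches.
find_app_any_state() {
    yarn application -list -appStates ALL 2>/dev/null | grep -F -- "$1"
}
```

Calling find_app_any_state application_1563651888010_526091 would show the state YARN recorded for that application, if it is still known to the ResourceManager.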

commented Aug 2, 2019 by Shruthi

@Shruthi, I got it. I will run it now, get the log, and post it here.

commented Aug 2, 2019 by Hemanth

Sure, let me know.

commented Aug 2, 2019 by Shruthi

@Shruthi, I just started the job. It has been running for 10 minutes, with about 40 more to go.

With yarn application -list -appStates RUNNING I can see my job running, but when I run

yarn logs -applicationId <application ID>

it gives the same error as before: the logs are not present.

commented Aug 2, 2019 by Hemanth

Hi @Hemanth,

Could you check your /etc/hadoop/conf/yarn-site.xml and make sure the following two properties are set:

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>

And don't forget to restart the yarn services.
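
Before editing and restarting anything, it may be quicker to read the current values out of the file. The sketch below assumes each <value> sits on a line after its <name>, as in the snippet above. It is a line-based lookup, not a real XML parser, and the function name is illustrative:

```shell
#!/bin/sh
# Print the value of one property from a Hadoop-style XML config file.
# $1 = path to the file, $2 = exact property name.
# Prints nothing if the property is not found.
yarn_site_value() {
    awk -v name="$2" '
        # Remember that the wanted <name> line has been seen.
        index($0, "<name>" name "</name>") { found = 1; next }
        # The first <value> line after it holds the setting.
        found && /<value>/ {
            sub(/.*<value>/, "")
            sub(/<\/value>.*/, "")
            print
            exit
        }
    ' "$1"
}
```

For example, yarn_site_value /etc/hadoop/conf/yarn-site.xml yarn.nodemanager.aux-services should print mapreduce_shuffle if the first property is already in place.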

commented Aug 2, 2019 by Rashi

@Shruthi, I got the yarn log. Can you please check it?

]$ yarn logs -applicationId application_1563651888010_628876
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/hadoop/hadoop-2.7.0/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/08/02 06:31:32 ERROR client.MapRZKRMFinderUtils: Zookeeper address not found from MapRFilesystem. Will try the configuration from yarn-site.xmljava.io.IOException: failure to login: javax.security.auth.login.LoginException: Unable to obtain MapR credentials
19/08/02 06:31:32 ERROR client.MapRZKRMFinderUtils: Zookeeper address not found from MapRFilesystem. Will try the configuration from yarn-site.xml
19/08/02 06:31:32 ERROR client.MapRZKRMFinderUtils: Zookeeper address can not be retrieved. Trying backup zk address
19/08/02 06:31:32 ERROR client.MapRZKRMFinderUtils: Zookeeper address not configured in Yarn configuration. Please check yarn-site.xml.
19/08/02 06:31:32 ERROR client.MapRZKRMFinderUtils: Unable to determine ResourceManager service address from Zookeeper.
19/08/02 06:31:32 ERROR client.MapRZKBasedRMFailoverProxyProvider: Unable to create proxy to the ResourceManager null
java.lang.RuntimeException: Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration.
        at org.apache.hadoop.yarn.client.MapRZKRMFinderUtils.mapRZkBasedRMFinder(MapRZKRMFinderUtils.java:99)
        at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.updateCurrentRMAddress(MapRZKBasedRMFailoverProxyProvider.java:65)
        at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.getProxy(MapRZKBasedRMFailoverProxyProvider.java:132)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:73)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.<init>(RetryInvocationHandler.java:64)
        at org.apache.hadoop.io.retry.RetryProxy.create(RetryProxy.java:58)
        at org.apache.hadoop.yarn.client.RMProxy.createRMProxy(RMProxy.java:95)
        at org.apache.hadoop.yarn.client.ClientRMProxy.createRMProxy(ClientRMProxy.java:73)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStart(YarnClientImpl.java:203)
        at org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.createYarnClient(LogsCLI.java:185)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.verifyApplicationState(LogsCLI.java:157)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:122)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:193)
19/08/02 06:31:32 ERROR ipc.RPC: RPC.stopProxy called on non proxy: class=com.sun.proxy.$Proxy4
org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy since it is null
        at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:657)
        at org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider.close(MapRZKBasedRMFailoverProxyProvider.java:157)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.close(RetryInvocationHandler.java:207)
        at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:667)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStop(YarnClientImpl.java:220)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.verifyApplicationState(LogsCLI.java:176)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:122)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:193)
19/08/02 06:31:32 INFO service.AbstractService: Service org.apache.hadoop.yarn.client.api.impl.YarnClientImpl failed in state STOPPED; cause: org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy - is not Closeable or does not provide closeable invocation handler class com.sun.proxy.$Proxy4
org.apache.hadoop.HadoopIllegalArgumentException: Cannot close proxy - is not Closeable or does not provide closeable invocation handler class com.sun.proxy.$Proxy4
        at org.apache.hadoop.ipc.RPC.stopProxy(RPC.java:680)
        at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceStop(YarnClientImpl.java:220)
        at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:221)
        at org.apache.hadoop.service.AbstractService.close(AbstractService.java:250)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.verifyApplicationState(LogsCLI.java:176)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:122)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:193)
Unable to get ApplicationState. Attempting to fetch logs directly from the filesystem.
Exception in thread "main" java.io.IOException: failure to login: javax.security.auth.login.LoginException: Unable to obtain MapR credentials
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:751)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:688)
        at org.apache.hadoop.security.UserGroupInformation.getCurrentUser(UserGroupInformation.java:572)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.run(LogsCLI.java:136)
        at org.apache.hadoop.yarn.client.cli.LogsCLI.main(LogsCLI.java:193)
Caused by: javax.security.auth.login.LoginException: Unable to obtain MapR credentials
        at com.mapr.security.maprsasl.MaprSecurityLoginModule.login(MaprSecurityLoginModule.java:228)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at javax.security.auth.login.LoginContext.invoke(LoginContext.java:755)
        at javax.security.auth.login.LoginContext.access$000(LoginContext.java:195)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:682)
        at javax.security.auth.login.LoginContext$4.run(LoginContext.java:680)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.login.LoginContext.invokePriv(LoginContext.java:680)
        at javax.security.auth.login.LoginContext.login(LoginContext.java:587)
        at org.apache.hadoop.security.UserGroupInformation.loginUserFromSubject(UserGroupInformation.java:724)
        ... 4 more
Caused by: com.mapr.login.MapRLoginException: Unable to authenticate as ticket is not available
        at com.mapr.login.client.MapRLoginHttpsClient.authenticateIfNeeded(MapRLoginHttpsClient.java:173)
        at com.mapr.login.client.MapRLoginHttpsClient.authenticateIfNeeded(MapRLoginHttpsClient.java:115)
        at com.mapr.security.maprsasl.MaprSecurityLoginModule.login(MaprSecurityLoginModule.java:222)
        ... 16 more

commented Aug 2, 2019 by Hemanth

@Hemanth, the error says:

Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration.

Set up ZooKeeper and configure its address in YARN. It should work.
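
It may also be worth reading the trace to its end. The last "Caused by:" entry in the output above is "com.mapr.login.MapRLoginException: Unable to authenticate as ticket is not available", which could mean that the user running yarn logs has no valid MapR ticket, in which case the ZooKeeper lookup would fail as a side effect. A sketch for pulling that innermost cause out of a saved trace (the function name is hypothetical):

```shell
#!/bin/sh
# Print the innermost cause of a Java stack trace read from stdin,
# i.e. the text of the last line that starts with "Caused by:".
innermost_cause() {
    grep '^Caused by: ' | tail -n 1 | sed 's/^Caused by: //'
}
```

For example, yarn logs -applicationId <application ID> 2>&1 | innermost_cause. If the ticket does turn out to be the problem, maprlogin print and maprlogin password are the commands to check first, assuming ticket-based security is enabled on this cluster.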

commented Aug 2, 2019 by Shruthi

@Shruthi, thanks a lot. I will do as you suggested and keep you posted.

commented Aug 2, 2019 by Hemanth

@Shruthi, hi. My yarn-site.xml looks like this. What else do I need to configure?

<configuration>
  <!-- Resource Manager MapR HA Configs -->
  <property>
    <name>yarn.resourcemanager.ha.custom-ha-enabled</name>
    <value>true</value>
    <description>MapR Zookeeper based RM Reconnect Enabled. If this is true, set the failover proxy to be the class MapRZKBasedRMFailoverProxyProvider</description>
  </property>
  <property>
    <name>yarn.client.failover-proxy-provider</name>
    <value>org.apache.hadoop.yarn.client.MapRZKBasedRMFailoverProxyProvider</value>
    <description>Zookeeper based reconnect proxy provider. Should be set if and only if mapr-ha-enabled property is true.</description>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
    <description>RM Recovery Enabled</description>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.custom-ha-rmaddressfinder</name>
    <value>org.apache.hadoop.yarn.client.MapRZKBasedRMAddressFinder</value>
  </property>

  <property>
    <name>yarn.acl.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.admin.acl</name>
    <value> </value>
  </property>

  <!-- :::CAUTION::: DO NOT EDIT ANYTHING ON OR ABOVE THIS LINE -->
</configuration>

commented Aug 5, 2019 by Hemanth
edited Aug 5, 2019 by Omkar

Hi @Shruthi, one point of confusion here: I checked the yarn logs for both my failed job and my successful jobs, using their job IDs. Both the successful and the failed jobs show the same exception pasted above: "Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration".

The application IDs of the successful jobs give the same log, so that means this is not the issue.

We have a daily job that runs successfully. A new requirement came in, so I made some changes and created a separate shell script file. I am facing this issue while testing that change, whereas the daily job still runs normally and succeeds. So I think this might be a code issue. If it is, how can I confirm that? I would also like to enable debug logging, since at present we only get INFO. If possible, can you let me know the location of the log4j.properties file?

TIA

commented Aug 5, 2019 by Hemanth

Hi @Hemanth,

As your error says:

Zookeeper address not found from MapR Filesystem and is also not configured in Yarn configuration.

By default, the ResourceManager stores its state in MapR-FS. To configure the ResourceManager to use ZooKeeper instead:

Set the value of yarn.resourcemanager.store.class to org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore in yarn-site.xml.

Set the value of yarn.resourcemanager.zk-address to a comma-separated list of host:port pairs, one for each ZooKeeper server used by the ResourceManager. This property also needs to be set in yarn-site.xml. The ResourceManager uses these hosts to store its state.
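
Since the value has to be a comma-separated list of host:port pairs, a quick format check before pasting it into yarn-site.xml can catch typos. This is only a sketch: the host names and port below are placeholders, not values from this cluster, and the check covers the shape of the string only, not whether the servers are reachable.

```shell
#!/bin/sh
# Return 0 if $1 looks like "host:port[,host:port...]", non-zero otherwise.
# Only the shape is checked; nothing is resolved or contacted.
is_zk_address_list() {
    printf '%s\n' "$1" |
        grep -Eq '^[A-Za-z0-9.-]+:[0-9]+(,[A-Za-z0-9.-]+:[0-9]+)*$'
}

# Placeholder hosts and port, for illustration only.
if is_zk_address_list "zk1.example.com:5181,zk2.example.com:5181"; then
    echo "format ok"
fi
```

A value with a missing port, a trailing comma, or spaces between entries is rejected by this check.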