Hive and Yarn Examples on Spark

Last updated on May 22, 2019

Awanish
Awanish is a Sr. Research Analyst at Edureka. He has rich expertise in Big Data technologies like Hadoop, Spark, Storm, Kafka, Flink. Awanish also...

We have learnt how to build Hive and Yarn on Spark. Now let us try out Hive and Yarn examples on Spark.

Hive Example on Spark

We will run an example of Hive on Spark. We will create a table, load data into that table and execute a simple query. When working with Hive, one must construct a HiveContext, which inherits from SQLContext.

Command: cd spark-1.1.1

Command: ./bin/spark-shell

Create an input file 'sample' in your home directory. The file is tab separated, with a name and an integer rank on each line.
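One possible way to create such a file from the terminal is sketched below. The names and ranks are made-up illustrative values, and the SAMPLE_FILE variable is an assumption added here so the target path can be overridden; by default it points to 'sample' in your home directory.

```shell
#!/bin/sh
# Write a small tab-separated sample file: name<TAB>rank, one record per line.
# The records below are hypothetical values chosen only for illustration.
SAMPLE_FILE="${SAMPLE_FILE:-$HOME/sample}"
printf 'alice\t1\nbob\t3\ncarol\t7\ndave\t9\n' > "$SAMPLE_FILE"

# Print the file with tabs made visible (as \t) to confirm the delimiter.
sed -n l "$SAMPLE_FILE"
```

Using printf rather than typing the file by hand avoids editors that silently turn tabs into spaces, which would break the tab-delimited load.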

Command: val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

Command: sqlContext.sql("CREATE TABLE IF NOT EXISTS test (name STRING, rank INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n'")

Command: sqlContext.sql("LOAD DATA LOCAL INPATH '/home/edureka/sample' INTO TABLE test")

Command: sqlContext.sql("SELECT * FROM test WHERE rank < 5").collect().foreach(println)

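To get a rough idea of which rows the query above should return, the same "rank < 5" filter can be reproduced outside Spark with awk. This is only a sanity-check sketch on hypothetical data written to a temporary file; it does not show how Spark formats its output.

```shell
#!/bin/sh
# Reproduce the filter "rank < 5" outside Spark to see which rows to expect.
# The data is hypothetical; substitute the contents of your own sample file.
tmp="$(mktemp)"
printf 'alice\t1\nbob\t3\ncarol\t7\ndave\t9\n' > "$tmp"

# -F'\t' splits each line on tabs; $2 is the rank column.
expected="$(awk -F'\t' '$2 < 5' "$tmp")"
printf '%s\n' "$expected"
rm -f "$tmp"
```

If the rows printed in the Spark shell do not match such a check on your own file, the field delimiter in the table definition is a likely place to look.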

Yarn Example on Spark

We will run the SparkPi example on Yarn. We can deploy Spark on Yarn in two modes: cluster mode and client mode. In yarn-cluster mode, the Spark driver runs inside an application master process which is managed by Yarn on the cluster, and the client can go away after initiating the application. In yarn-client mode, the driver runs in the client process, and the application master is only used for requesting resources from Yarn.

Command: cd spark-1.1.1

Command: SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.1.1-hadoop2.2.0.jar ./bin/spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi --num-executors 1 --driver-memory 2g --executor-memory 1g --executor-cores 1 examples/target/scala-2.10/spark-examples-1.1.1-hadoop2.2.0.jar

After you execute the above command, please wait for some time until you get the SUCCEEDED message.

Browse to localhost:8088/cluster and click on the Spark application.

Click on logs.

Click on stdout to check the output.

To deploy Spark on Yarn in client mode, just set --deploy-mode to "client". Now you know how to build Hive and Yarn on Spark, and we have also worked through practical examples of both.
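As a sketch of the client-mode submission described above, the small shell function below prints the spark-submit command for a given deploy mode without running it. The function name submit_cmd is hypothetical; every flag and jar path is copied from the cluster-mode command in this post, and only the --deploy-mode value changes.

```shell
#!/bin/sh
# Dry run: print the spark-submit command for a given deploy mode.
# submit_cmd is a hypothetical helper; flags and paths come from the
# cluster-mode command shown earlier in this post.
submit_cmd() {
  mode="$1"   # "cluster" or "client"
  printf '%s\n' "SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.1.1-hadoop2.2.0.jar ./bin/spark-submit --master yarn --deploy-mode $mode --class org.apache.spark.examples.SparkPi --num-executors 1 --driver-memory 2g --executor-memory 1g --executor-cores 1 examples/target/scala-2.10/spark-examples-1.1.1-hadoop2.2.0.jar"
}

# Print the client-mode variant; copy the output and run it from spark-1.1.1.
submit_cmd client
```

Printing the command first makes it easy to compare the two modes side by side before submitting anything to the cluster.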

Got a question for us? Please mention it in the comments section and we will get back to you.

Comments
2 Comments
  • Hi, I got an error. What is this error, and how do I overcome it?

    Caused by: org.datanucleus.exceptions.NucleusException: Attempt to invoke the "dbcp-builtin" plugin to create a ConnectionPool gave an error : The specified datastore driver ("com.mysql.jdbc.Driver") was not found in the CLASSPATH. Please check your CLASSPATH specification, and the name of the driver.

    • Hi Venu, it is not able to find the JDBC driver class. Add it to your driver classpath as follows:

      ./bin/spark-sql --driver-class-path /path/to/connector……jar

      The connector jar will depend on the database that has been set as the metastore in Hive.

      If it is MySQL, then the command would be:

      ./bin/spark-sql --driver-class-path /home/edureka/spark-1.1.1/lib/mysql-connector-java-5.1.32-bin.jar
