How to get Spark dataset metadata

I want to convert a Dataset into another kind of object, and I need the Dataset's metadata before converting it. Is there a function in Spark that can help?

Thanks in advance.
Apr 26, 2018 in Apache Spark by Ashish

1 answer to this question.

There are several functions that can help here.

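For the schema of a Dataset ds, use ds.schema, which returns a StructType.

From the schema you can read ds.schema.size (the number of fields), ds.schema.fields (the array of StructField entries) and ds.schema.fieldNames (the column names).

You can also print the schema as a tree with ds.printSchema()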
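To make this concrete, here is a minimal sketch of those calls in Scala. The Person case class, the sample rows, the app name and the local master are all made up for illustration and are not from the question. ds.columns and ds.dtypes are two further Dataset methods not mentioned above that may also be handy.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.StructType

// Hypothetical record type, used only for illustration.
case class Person(name: String, age: Int)

object DatasetMetadataExample {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for trying this out.
    val spark = SparkSession.builder()
      .appName("dataset-metadata-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data.
    val ds = Seq(Person("Ann", 31), Person("Bob", 45)).toDS()

    // The schema is a StructType, i.e. a sequence of StructField.
    val schema: StructType = ds.schema

    println(schema.size)                      // number of top-level fields
    println(schema.fieldNames.mkString(", ")) // column names

    // Per-field details: name, data type and nullability.
    schema.fields.foreach { f =>
      println(s"${f.name}: ${f.dataType.simpleString}, nullable = ${f.nullable}")
    }

    // Human-readable tree view of the same information.
    ds.printSchema()

    // Shortcuts on the Dataset itself: names, and (name, type) pairs.
    println(ds.columns.mkString(", "))
    ds.dtypes.foreach { case (name, tpe) => println(s"$name -> $tpe") }

    spark.stop()
  }
}
```

If you are in spark-shell, a session named spark already exists, so you can skip the builder and run the ds.schema lines directly on your own Dataset.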

Hope this helps
answered Apr 26, 2018 by kurt_cobain
