This post covers the steps required to create a UDF (User Defined Function) in Apache Pig. A filter UDF must extend the FilterFunc class and implement a method called exec, which takes a Tuple as its argument and returns a Boolean. In the example here, exec returns false if the Tuple is null or empty, or if its first field is null. Otherwise the UDF, IsOfAge, checks whether the given age is one of the accepted values (18, 19, 21, 23 or 27) and returns true or false accordingly. The logic of the UDF is written in Java, then compiled and exported as a JAR file. The JAR file is later registered in the Pig script, which makes its classes available to Pig when the script is loaded.
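import java.io.IOException;
import org.apache.pig.FilterFunc;
import org.apache.pig.backend.executionengine.ExecException;
import org.apache.pig.data.Tuple;

public class IsOfAge extends FilterFunc {
    @Override
    public Boolean exec(Tuple tuple) throws IOException {
        // A missing or empty input tuple never passes the filter.
        if (tuple == null || tuple.size() == 0) {
            return false;
        }
        try {
            Object object = tuple.get(0);
            if (object == null) {
                return false;
            }
            int i = (Integer) object;
            // Only the listed ages pass the filter.
            return i == 18 || i == 19 || i == 21 || i == 23 || i == 27;
        } catch (ExecException e) {
            throw new IOException(e);
        }
    }
}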
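If you want to try the filter logic before building the JAR, the sketch below mirrors the same checks in plain Java. It is only an illustration under a few assumptions: a List<Object> stands in for Pig's Tuple so that it runs without the Pig library, and the names IsOfAgeSketch, isOfAge and filterByAge are hypothetical, not part of Pig.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Plain-Java sketch of the IsOfAge check. List<Object> stands in for
// Pig's Tuple (an assumption, so the example runs without the Pig JAR).
final class IsOfAgeSketch {

    // Mirrors exec(): false for a null or empty row or a null first field,
    // otherwise true only for the accepted ages.
    static boolean isOfAge(List<Object> fields) {
        if (fields == null || fields.isEmpty()) {
            return false;
        }
        Object first = fields.get(0);
        if (first == null) {
            return false;
        }
        int age = (Integer) first; // like the UDF, assumes an Integer field
        return age == 18 || age == 19 || age == 21 || age == 23 || age == 27;
    }

    // Rough stand-in for a Pig filter statement: keep only rows that pass.
    static List<List<Object>> filterByAge(List<List<Object>> rows) {
        return rows.stream()
                   .filter(IsOfAgeSketch::isOfAge)
                   .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<List<Object>> rows = Arrays.asList(
            Arrays.<Object>asList(18),              // accepted age
            Arrays.<Object>asList(20),              // age not in the list
            Collections.<Object>emptyList(),        // empty row
            Arrays.<Object>asList((Object) null));  // null age field
        System.out.println(filterByAge(rows)); // prints [[18]]
    }
}
```

Only the row holding 18 survives, because 20 is not in the accepted list and the empty and null rows are rejected before the comparison is reached.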
Once the UDF is created and exported as a JAR file, the following commands register the JAR file and call the function in a filter statement.
register myudf.jar;
X = filter A by IsOfAge(age);
There are many predefined functions in Apache Pig. You can also create your own function, known as a User Defined Function (UDF). The Pig UDF here is written in Java, and it needs the Pig library in order to use predefined classes such as FilterFunc and Tuple. The Apache Pig library pig-0.8.0-cdh3u0-core.jar can be downloaded from the internet.
For the steps to create a Pig script with a UDF in HDFS mode, see the post "Apache Pig Script With UDF in HDFS Mode".
Got a question for us? Mention it in the comments section and we will get back to you.
Very nice and informative blog, sir. I appreciate your efforts. I have one question: how do I use a Hive internal table's schema and data in a Pig script?
Hey Aamir, thanks for checking out our blog. You can use a Hive internal table's schema and data in Pig through HCatalog. Please follow the link below for further reference:
http://www.javachain.com/load-and-store-from-hive-table-to-pig
Hope this helps. Cheers!