Pig Programming | Apache Pig Script with UDF in HDFS Mode

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;

@SuppressWarnings("deprecation")
public class Upper extends EvalFunc<String> {
    public String exec(Tuple input) throws IOException {
        if (input == null || input.size() == 0)
            return null;
        try {
            String str = (String) input.get(0);
            str = str.toUpperCase();
            return str;
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}

import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.PigWarning;
import org.apache.pig.data.Tuple;

public class Pow extends EvalFunc<Long> {
    public Long exec(Tuple input) throws IOException {
        try {
            int base = (Integer) input.get(0);
            int exponent = (Integer) input.get(1);
            long result = 1;
            /* Probably not the most efficient method... */
            for (int i = 0; i < exponent; i++) {
                long preresult = result;
                result *= base;
                if (preresult > result) {
                    // We overflowed. Give a warning, but do not throw an
                    // exception.
                    warn("Overflow!", PigWarning.TOO_LARGE_FOR_INT);
                    // Returning null will indicate to Pig that we failed but
                    // we want to continue execution.
                    return null;
                }
            }
            return result;
        } catch (Exception e) {
            // Throwing an exception will cause the task to fail.
            throw new IOException("Something bad happened!", e);
        }
    }
}
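
The `preresult > result` comparison in the Pow loop also fires for a negative base (for base -2 the first product, -2, is already smaller than 1), and it can miss an overflow that wraps around to a larger value. One possible alternative is to let `Math.multiplyExact` detect the overflow. The sketch below shows only the arithmetic, without any Pig classes; the names `PowSketch` and `powOrNull` are hypothetical, and returning null on overflow simply mirrors the convention used in the UDF.

```java
public class PowSketch {
    /** Returns base^exponent, or null if the result does not fit in a long. */
    public static Long powOrNull(int base, int exponent) {
        long result = 1;
        try {
            // As in the Pow loop, a non-positive exponent leaves result at 1.
            for (int i = 0; i < exponent; i++) {
                // multiplyExact throws ArithmeticException instead of wrapping around.
                result = Math.multiplyExact(result, (long) base);
            }
        } catch (ArithmeticException overflow) {
            return null;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(powOrNull(2, 10));   // 1024
        System.out.println(powOrNull(-2, 3));   // -8
        System.out.println(powOrNull(10, 19));  // null (exceeds Long.MAX_VALUE)
    }
}
```

Inside a real UDF, the catch branch would be the place to call `warn(...)` before returning null.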

pavan says:
Jun 3, 2018 at 3:33 am GMT
Could you please share the script.pig code?
aamir says:
Jan 5, 2017 at 10:38 am GMT
How can nested SQL queries from an RDBMS be translated to the Hadoop framework using Pig?
- EdurekaSupport says:
  Jan 9, 2017 at 10:57 am GMT
  Hey Aamir, thanks for checking out our blog. You can write the query in SQL first and then translate it into Pig Latin, clause by clause.
  Syntax:
  WHERE → FILTER
  The syntax is different, but conceptually this is still putting your data into a funnel to create a smaller dataset.
  HAVING → FILTER
  Because a FILTER is done in a separate step from a GROUP or an aggregation, the distinction between HAVING and WHERE doesn’t exist in Pig.
  ORDER BY → ORDER
  This keyword behaves pretty much the same in Pig as in SQL.
  Example:
  AVERAGE SALARY BY LOCATION
  SQL
  SELECT loc, AVG(sal) FROM emp JOIN dept USING(deptno) WHERE sal >3000 GROUP BY loc;
  PIG LATIN
  filtered_emp = FILTER emp BY sal > 3000;
  emp_join_dept = JOIN filtered_emp BY deptno, dept BY deptno;
  grouped_by_loc = GROUP emp_join_dept BY loc;
  avg_salary = FOREACH grouped_by_loc GENERATE group, AVG(emp_join_dept.sal);
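
To make the dataflow concrete, here is a minimal plain-Java sketch of the same four steps (filter, inner join on deptno, group by loc, average) over in-memory lists. It does not use Pig at all; the class names `AvgSalaryByLoc`, `Emp`, and `Dept` and the sample rows in `main` are hypothetical, and it assumes each deptno appears only once in dept.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class AvgSalaryByLoc {
    // Hypothetical row types standing in for the emp and dept relations.
    public static final class Emp {
        final int deptno; final double sal;
        public Emp(int deptno, double sal) { this.deptno = deptno; this.sal = sal; }
    }
    public static final class Dept {
        final int deptno; final String loc;
        public Dept(int deptno, String loc) { this.deptno = deptno; this.loc = loc; }
    }

    public static Map<String, Double> avgSalaryByLoc(List<Emp> emps, List<Dept> depts) {
        // Lookup table deptno -> loc; assumes deptno is unique in dept.
        Map<Integer, String> locByDept = depts.stream()
                .collect(Collectors.toMap(d -> d.deptno, d -> d.loc));
        return emps.stream()
                .filter(e -> e.sal > 3000)                      // FILTER emp BY sal > 3000
                .filter(e -> locByDept.containsKey(e.deptno))   // inner JOIN on deptno
                .collect(Collectors.groupingBy(
                        e -> locByDept.get(e.deptno),           // GROUP ... BY loc
                        TreeMap::new,
                        Collectors.averagingDouble(e -> e.sal))); // AVG(sal)
    }

    public static void main(String[] args) {
        List<Emp> emps = Arrays.asList(
                new Emp(10, 4000), new Emp(10, 5000), new Emp(20, 2500), new Emp(20, 3500));
        List<Dept> depts = Arrays.asList(new Dept(10, "NEW YORK"), new Dept(20, "DALLAS"));
        System.out.println(avgSalaryByLoc(emps, depts));
    }
}
```

As in the Pig Latin version, the filter runs before the join, so rows with sal of 3000 or less never reach the grouping step.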
  Hope this helps. Cheers!
ms says:
Jun 13, 2015 at 2:45 am GMT
2015-06-13 08:15:01,140 [main] ERROR org.apache.pig.tools.grunt.Grunt – ERROR 1070: Could not resolve MyUpper using imports: [, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.]
- ms says:
  Jun 13, 2015 at 2:46 am GMT
  I'm getting this error when I try to register my UDF and execute the Pig script.
  - ms says:
    Jun 13, 2015 at 3:15 am GMT
    I created the UDF inside a package org.ms.pig.udf.xxx, and this is the root cause. It looks like simply registering the JAR will not help in this case. Could you please help me with the command to register the UDF if it is in a package?
    - EdurekaSupport says:
      Jul 6, 2015 at 6:49 am GMT
      Hi Ms, let us assume I have placed the Upper class in a package pig.udf.examples. The command to register the jar stays the same: register name_of_jar;
      But when using the UDF, we need to give the package name of the class as well.
      E.g.: B = foreach A generate pig.udf.examples.Upper(f1);
      - Ananth N says:
        May 19, 2016 at 3:32 pm GMT
        Generally, classes are written inside specific packages; we never write them in the default package.
        That being the case, we must always refer to the UDF by the fully qualified name of its class.
        But the built-in functions can be referred to by their unqualified class names, which makes the code more readable. I think the default Pig engine searches for function classes in some standard packages (java.lang., org.apache.pig.builtin., etc.). Is there a way to refer to UDFs without the package name in a similar way? Maybe by declaring our own package in the list of standard packages?
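
The import list printed in the ERROR 1070 message above (an empty prefix, java.lang., org.apache.pig.builtin., org.apache.pig.impl.builtin.) is consistent with a lookup that tries each prefix in turn. The following is a simplified sketch of that idea in plain Java, not Pig's actual implementation; `ImportResolver` and `resolve` are hypothetical names. For the underlying question, Pig's documentation describes a `udf.import.list` Java property (passed as `-Dudf.import.list=...`) for adding packages to the search list, and a `DEFINE` statement for giving a fully qualified UDF a short alias; check the documentation for your Pig version before relying on either.

```java
import java.util.Arrays;
import java.util.List;

public class ImportResolver {
    /** Tries each prefix + name in order; returns the first class found, or null. */
    public static Class<?> resolve(String name, List<String> prefixes) {
        for (String prefix : prefixes) {
            try {
                return Class.forName(prefix + name);
            } catch (ClassNotFoundException | LinkageError notUnderThisPrefix) {
                // Not found with this prefix; try the next one.
            }
        }
        return null; // corresponds to the "Could not resolve ..." case
    }

    public static void main(String[] args) {
        // The empty prefix covers names that are already fully qualified.
        List<String> imports = Arrays.asList("", "java.lang.", "java.util.");
        System.out.println(resolve("String", imports));
        System.out.println(resolve("java.util.ArrayList", imports));
        System.out.println(resolve("MyUpper", imports));
    }
}
```

Under this model, a UDF in pig.udf.examples resolves only if its fully qualified name is used or its package prefix is added to the list.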
utkarsh says:
Mar 28, 2015 at 12:11 am GMT
Your jar creation won't succeed because the .class files for Upper are missing.
- utkarsh says:
  Mar 28, 2015 at 12:13 am GMT
  Jar creation will throw this error:
  class file for Upper.java not found in classpath
  - Dilip Kumar says:
    Apr 21, 2015 at 5:20 pm GMT
    I am getting that error. How do I resolve it?
    - EdurekaSupport says:
      Apr 22, 2015 at 5:15 am GMT
      Hi Dilip, try adding pig-0.8.0-cdh3u0-core.jar in the Java Project and try it out again.
      - Dilip Kumar says:
        Apr 23, 2015 at 3:44 pm GMT
        Hi,
        I am not able to find a link to download the file that you mentioned. Can you please provide one? I have tried many options to run my UDF but got stuck here. Kindly provide a solution as soon as possible. Thanks in advance!
      - EdurekaSupport says:
        May 7, 2015 at 9:18 am GMT
        Hi Dilip, we can add any Pig library jar file to our project to run the UDF program. Use the following link to download the required jar file:
        https://edureka.wistia.com/medias/uovcf7gcyt/download?media_file_id=76020493
  - EdurekaSupport says:
    Apr 22, 2015 at 5:14 am GMT
    Hi Utkarsh, you might have forgotten to add pig-0.8.0-cdh3u0-core.jar to the Java Project. Try adding it and creating the jar again.
Wasim says:
Sep 29, 2014 at 11:49 pm GMT
This program also requires additional jar files (hadoop_commons and commons_logging), and there is a compilation error in the Java program. Please correct it.
- EdurekaSupport says:
  Oct 10, 2014 at 4:56 am GMT
  Hi Wasim, as we have used just the Pig API here, we don't need to add the other jars for this UDF.
  Whether extra jars are needed depends on the APIs used in the UDF.
  You might be getting the compilation error if pig-0.8.0-cdh3u0-core.jar was not added to the project properly. Check that you have selected this jar in the Order and Export tab of Configure Build Path, and then try once again.