Here, we have two tables:
- Tab1 having columns id, name and age
- Tab2 having columns id, name and email
Use the commands below to load the data in Pig:
tab1 = load '/mnt/home/edureka_425640/pig_join_1.txt' using PigStorage(',') as (id:int,name:chararray,age:int);
Dump tab1;
tab2 = load '/mnt/home/edureka_425640/pig_join_2.txt' using PigStorage(',') as (id:int,name:chararray,email:chararray);
Dump tab2;
Now, join the two tables on two columns:
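A join on two columns is usually written by giving each relation a tuple of keys. Here is a minimal sketch, assuming the relations tab1 and tab2 loaded earlier and assuming the two join columns are id and name (the only columns both tables share). The alias joined is a hypothetical name chosen for illustration.

```pig
-- Assumes tab1 and tab2 were already loaded as shown earlier.
-- `joined` is a hypothetical alias; id and name are assumed to be the join keys.
-- Inner join: keeps only rows where both id AND name match in the two relations.
joined = JOIN tab1 BY (id, name), tab2 BY (id, name);
DUMP joined;
```

The joined relation carries the fields of both inputs, so the duplicated key columns are referenced with the relation prefix, for example tab1::id and tab2::id.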
Below is the output:
Hope this helps you.