Friday, 13 September 2013

Counting repeated values per line with Pig in a multi-line dataset

I am new to Pig and am trying to solve a word count problem where the
words are websites and the count has to be done separately for each line
of input. For example, my input dataset looks like this:
Input data
Email websites
e1 web1 web2 web3 web1 ....
e2 web2 web3 web2 web2 web4 ...
e3 web1 web2 web1 web4 .....
and my desired output is:
Email websites
e1 web1(2) web2(1) web3(1) ....
e2 web2(3) web3(1) web4(1) ...
e3 web1(2) web2(1) web4(1) .....
My dataset has almost 50,000 email IDs (users).
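
To pin down the transformation being asked for, here is a minimal sketch in plain Python rather than Pig. It is only a reference for the expected result, not a Pig solution. It assumes that fields are separated by whitespace, that the first field on each line is the email ID, and that the trailing dots in the sample above are elisions rather than data. The function name count_sites is made up for this illustration.

```python
from collections import Counter


def count_sites(line):
    """Turn 'e1 web1 web2 web1' into 'e1 web1(2) web2(1)'."""
    # Assumption: the first whitespace-separated field is the email ID,
    # everything after it is a website.
    email, *sites = line.split()
    # Counter keeps keys in first-seen order on Python 3.7+,
    # which matches the ordering in the desired output.
    counts = Counter(sites)
    return " ".join([email] + [f"{site}({n})" for site, n in counts.items()])


# Sample rows from the question, with the elided tails left out.
sample = [
    "e1 web1 web2 web3 web1",
    "e2 web2 web3 web2 web2 web4",
    "e3 web1 web2 web1 web4",
]

for line in sample:
    print(count_sites(line))
```

A Pig script would presumably need the same three steps: split each line into an email and its websites, group by the (email, website) pair, and count each group.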
