hadoop - Apache Pig: filter based on tuple member content


I'm learning Apache Pig and have run into a problem getting the result I want. I have this relation (after doing a GROUP BY):

mlset_1: {group: chararray, mlset: {(key: chararray, text: chararray)}}

I'd like to generate a key when one pattern (pattern_a) appears in the text field and another pattern (pattern_b) does not appear in the text field for that key.

I know I can use mlset.text to get a bag of the text values for a specific key, but I'm still stuck on the same issue: how to filter on the list of items in that bag.

Here's an example:

(key_a,{(key_a,start),(key_a,stop),(key_a,unknown),(key_a,whatever)})
(key_b,{(key_b,stop),(key_b,whatever)})
(key_c,{(key_c,start),(key_c,stop),(key_c,whatever)})

I'd like the keys of the lines where "start" appears and "unknown" does not appear. In this example, key_c is the result.

Thanks in advance!

Here's some code that might help you out. The solution is a nested FOREACH:

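-- pattern_a and pattern_b stand for your own values, e.g. 'start' and 'unknown'
c = FOREACH mlset_1 {
    f1 = FILTER mlset BY (text == pattern_a);  -- tuples matching pattern_a
    f2 = FILTER mlset BY (text == pattern_b);  -- tuples matching pattern_b
    GENERATE group, COUNT(f1) AS cnt1, COUNT(f2) AS cnt2;
};
-- keep keys where pattern_a occurs at least once and pattern_b never occurs
d = FILTER c BY (cnt1 > 0 AND cnt2 == 0);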

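You'll have to adapt the comparison in the nested FILTER statements to your patterns, for example by using MATCHES with a regular expression if a pattern is only part of the text field.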
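To sanity-check the intent of the two counts against the example in the question, here is a small plain-Python model of the logic. It is not Pig code, only a sketch: it assumes pattern_a is 'start', pattern_b is 'unknown', and that an exact match on the text field is what's wanted. The names `count_matches` and `select_keys` are made up for illustration.

```python
# Plain-Python model of the nested FOREACH / FILTER / COUNT idea (not Pig code).
# The rows mirror the grouped example from the question.
grouped = [
    ("key_a", [("key_a", "start"), ("key_a", "stop"),
               ("key_a", "unknown"), ("key_a", "whatever")]),
    ("key_b", [("key_b", "stop"), ("key_b", "whatever")]),
    ("key_c", [("key_c", "start"), ("key_c", "stop"), ("key_c", "whatever")]),
]

PATTERN_A = "start"    # assumed value: must appear at least once
PATTERN_B = "unknown"  # assumed value: must not appear at all


def count_matches(bag, pattern):
    """Count tuples in the bag whose text field equals the pattern."""
    return sum(1 for _key, text in bag if text == pattern)


def select_keys(rows, pattern_a, pattern_b):
    """Keep groups with at least one pattern_a and zero pattern_b."""
    selected = []
    for group, bag in rows:
        cnt1 = count_matches(bag, pattern_a)  # plays the role of COUNT(f1)
        cnt2 = count_matches(bag, pattern_b)  # plays the role of COUNT(f2)
        if cnt1 > 0 and cnt2 == 0:
            selected.append(group)
    return selected


result = select_keys(grouped, PATTERN_A, PATTERN_B)
print(result)  # ['key_c']
```

Only key_c passes both conditions: key_a contains "unknown" and key_b has no "start".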

