mapreduce - Hadoop: Getting the input file name in the mapper only once


I am new to Hadoop and have a small query.

I have around 10 files in an input folder that I need to pass to my MapReduce program. I want the file name in the mapper, because the file name contains the time at which the file was created. I have seen people use FileSplit to get the file name in the mapper. But if, say, my input files contain millions of lines, then every time the mapper code is called it will get the file name and extract the time from it, which is obviously a repeated, time-consuming thing to do for the same file. Once I have the time in the mapper, I should not have to extract it from the file name again and again.

How can I achieve this?

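You can use the mapper's setup method to get the file name. The setup method is guaranteed to run once per map task, before the first call to map(). With the standard FileInputFormat, each input split comes from a single file, so the name only has to be read once per task, like this: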

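import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class MapperRSJ extends Mapper<LongWritable, Text, CompositeKeyWritableRSJ, Text> {
  // Cached once per map task and reused by every map() call.
  String fileName;

  @Override
  protected void setup(Context context) throws IOException, InterruptedException {
    // The cast assumes a FileInputFormat-based job, where the split is a FileSplit.
    FileSplit fsFileSplit = (FileSplit) context.getInputSplit();
    // Name of the input file behind this split (last component of its path).
    fileName = fsFileSplit.getPath().getName();
  }

  @Override
  public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
    // process each key value pair; fileName is already set here
  }
}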
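The question also asks about extracting the creation time from the file name. The question does not show the naming scheme, so the sketch below assumes a hypothetical one such as events_20150312T101500.log, with a UTC timestamp after the last underscore. The class name FileNameTime and the method parseCreationMillis are made up for illustration. The idea is that setup() would call such a helper once and keep the result in a field next to the file name.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

public class FileNameTime {
    // ASSUMPTION: the timestamp in the file name looks like 20150312T101500 (UTC).
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd'T'HHmmss");

    // Parses the text between the last '_' and the last '.' as a timestamp
    // and returns it as milliseconds since the epoch.
    public static long parseCreationMillis(String fileName) {
        int start = fileName.lastIndexOf('_') + 1;
        int end = fileName.lastIndexOf('.');
        if (end <= start) {
            end = fileName.length(); // no extension after the timestamp
        }
        String stamp = fileName.substring(start, end);
        return LocalDateTime.parse(stamp, STAMP)
                .toInstant(ZoneOffset.UTC)
                .toEpochMilli();
    }

    public static void main(String[] args) {
        // Hypothetical file name, used only to exercise the helper.
        System.out.println(parseCreationMillis("events_20150312T101500.log"));
    }
}
```

In a mapper, this would probably be a single line at the end of setup(), for example assigning the parsed value to a long field, so that map() only reads the field and never touches the file name again.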
