MapReduce MultipleOutputs error
I want to store the output of a MapReduce job in 2 different directories, even though my code is designed to store the same output in both directories.
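My driver class code is below:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCountMain {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job myHadoopJob = new Job(conf);
        myHadoopJob.setJarByClass(WordCountMain.class);
        myHadoopJob.setJobName("word count job");
        FileInputFormat.addInputPath(myHadoopJob, new Path(args[0]));
        myHadoopJob.setMapperClass(WordCountMapper.class);
        myHadoopJob.setReducerClass(WordCountReducer.class);
        myHadoopJob.setInputFormatClass(TextInputFormat.class);
        myHadoopJob.setOutputFormatClass(TextOutputFormat.class);
        myHadoopJob.setMapOutputKeyClass(Text.class);
        myHadoopJob.setMapOutputValueClass(IntWritable.class);
        myHadoopJob.setOutputKeyClass(Text.class);
        myHadoopJob.setOutputValueClass(IntWritable.class);
        // Register the two named outputs used by the reducer.
        MultipleOutputs.addNamedOutput(myHadoopJob, "output1", TextOutputFormat.class, Text.class, IntWritable.class);
        MultipleOutputs.addNamedOutput(myHadoopJob, "output2", TextOutputFormat.class, Text.class, IntWritable.class);
        FileOutputFormat.setOutputPath(myHadoopJob, new Path(args[1]));
        System.exit(myHadoopJob.waitForCompletion(true) ? 0 : 1);
    }
}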
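My mapper code:

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String line = value.toString();
        String word = null;
        // Input lines are comma-separated words.
        StringTokenizer st = new StringTokenizer(line, ",");
        while (st.hasMoreTokens()) {
            word = st.nextToken();
            context.write(new Text(word), new IntWritable(1));
        }
    }
}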
My reducer code is below:

import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    MultipleOutputs<Text, IntWritable> mout = null;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        super.setup(context);
        mout = new MultipleOutputs<Text, IntWritable>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int count = 0;
        int num = 0;
        Iterator<IntWritable> ie = values.iterator();
        while (ie.hasNext()) {
            num = ie.next().get(); // 1
            count = count + num;
        }
        mout.write("output1", key, new IntWritable(count));
        mout.write("output2", key, new IntWritable(count));
    }
}
I am giving the output directories in the reduce method itself.
But when I run the MapReduce job using the command below, nothing happens. The MapReduce job is not started at all. The console is blank and stays idle.
hadoop jar wordcountmain.jar /user/cloudera/inputfiles/words.txt /user/cloudera/outputfiles/mapreduce/multipleoutputs
Could someone explain what went wrong and how to correct my code?
What actually happens is that 2 output files with different names are stored inside /user/cloudera/outputfiles/mapreduce/multipleoutputs.
But I need to store the output files in different directories.
In Pig, I can use 2 STORE statements, giving each a different directory.
How can I achieve the same in MapReduce?
You can try closing the MultipleOutputs object in the cleanup method of the reducer.
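As a rough sketch of that suggestion, the reducer below closes the MultipleOutputs object in cleanup(). It also uses the four-argument write overload, whose last argument is a base output path resolved relative to the job output directory, so each named output should land in its own subdirectory. The class name MultiDirWordCountReducer, the subdirectory names dir1 and dir2, and the sum helper are assumptions made for illustration and are not part of the original code. This may not by itself explain why the job appears idle.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class MultiDirWordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // Hypothetical base paths, relative to the job output directory (args[1]).
    static final String DIR1_BASE = "dir1/part";
    static final String DIR2_BASE = "dir2/part";

    private MultipleOutputs<Text, IntWritable> mout;

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        mout = new MultipleOutputs<Text, IntWritable>(context);
    }

    // Adds up the counts emitted by the mapper for one word.
    static int sum(Iterable<IntWritable> values) {
        int count = 0;
        for (IntWritable v : values) {
            count += v.get();
        }
        return count;
    }

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        IntWritable total = new IntWritable(sum(values));
        // The fourth argument selects the subdirectory and file name prefix.
        mout.write("output1", key, total, DIR1_BASE);
        mout.write("output2", key, total, DIR2_BASE);
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        // Closing flushes and releases the record writers opened by MultipleOutputs.
        mout.close();
    }
}
```

With the driver left unchanged apart from setReducerClass pointing at this class, the results should appear under dir1 and dir2 inside /user/cloudera/outputfiles/mapreduce/multipleoutputs, in files whose names start with part-r-.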