apache spark - Understanding MultiFileWordCount example in Hadoop
I'm trying to understand the MultiFileWordCount example in order to implement CombineFileInputSplit in Hadoop.

Most of it is easy to follow, except for two lines that I find confusing: I'm not clear why skipFirstLine is needed, or why the offset is decreased by 1.

The same code is used in a post on IBM's site as well. Any help here is appreciated.
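For context, the two lines in question are presumably the `skipFirstLine = true; --offset;` pair in the example's record reader. The sketch below is plain Java with no Hadoop types (`readSplit` is a made-up helper, not Hadoop API) and only illustrates the idiom under that assumption: a reader whose split starts mid-file cannot know whether it begins on a line boundary, so it skips its first (possibly partial) line and trusts the previous split's reader to have read past its own end to finish that line. Backing the offset up by one byte first handles the edge case where the split happens to start exactly at a line boundary: the byte at `offset - 1` is then the previous line's `'\n'`, so "skipping the first line" consumes only that newline and no real record is lost.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {

    // Return the complete lines "owned" by the byte range [offset, end) of data,
    // mirroring the split-boundary logic being asked about:
    //  - if offset != 0, back up one byte and discard up to the next '\n'
    //    (the skipFirstLine / --offset idiom);
    //  - keep reading past end until the current line is finished, since the
    //    next split's reader will skip its partial first line.
    static List<String> readSplit(byte[] data, int offset, int end) {
        int pos = offset;
        if (offset != 0) {
            // Back up one byte before skipping: if offset already sits at a
            // line start, data[offset - 1] is the previous line's '\n', so the
            // skip consumes only that newline. Without the back-up, a reader
            // starting exactly on a line boundary would wrongly drop a full line.
            pos = offset - 1;
            while (pos < data.length && data[pos] != '\n') pos++;
            pos++; // step past the '\n'
        }
        List<String> records = new ArrayList<>();
        while (pos < end && pos < data.length) { // only start records before end
            int start = pos;
            while (pos < data.length && data[pos] != '\n') pos++;
            records.add(new String(data, start, pos - start));
            pos++; // step past the '\n'
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes();
        // Boundary at byte 8 falls inside "bravo": the first reader finishes
        // that line, and the second reader skips the partial line it starts in.
        System.out.println(readSplit(data, 0, 8));           // [alpha, bravo]
        System.out.println(readSplit(data, 8, data.length)); // [charlie]
        // Boundary at byte 6 is exactly a line start: thanks to the one-byte
        // back-up, "bravo" is still read by the second reader, not skipped.
        System.out.println(readSplit(data, 6, data.length)); // [bravo, charlie]
    }
}
```

With this in mind, every byte of the file belongs to exactly one reader: each line is processed by the split in which it begins, except that a line beginning exactly at a split boundary belongs to the split that starts there.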