Understanding the MultiFileWordCount example in Hadoop
I'm trying to understand the MultiFileWordCount example in order to implement a CombineFileInputSplit in Hadoop. Most of it is easy to follow, except for two lines that I find confusing: https://github.com/hanborq/hadoop/blob/master/src/examples/org/apache/hadoop/examples/MultiFileWordCount.java#L142

I'm not clear on why skipFirstLine is needed, and why the offset is decreased by 1. The same thing is done in a post by IBM as well. Any help here would be appreciated.
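To make the question self-contained, here is a minimal standalone sketch of the pattern those two lines implement, as I understand it (all names here are mine, not Hadoop's, and this simplified version works on an in-memory byte array rather than an HDFS stream): when a split does not start at byte 0, the reader backs up one byte and discards the first line it sees, because the reader for the previous split reads past its own end to finish its last record.

```java
import java.util.ArrayList;
import java.util.List;

public class SkipFirstLineDemo {

    // Return the records belonging to the byte range [start, end) of data,
    // mimicking the skipFirstLine logic in Hadoop's line record readers.
    static List<String> readSplit(byte[] data, long start, long end) {
        long pos = start;
        boolean skipFirstLine = false;
        if (pos != 0) {
            skipFirstLine = true;
            --pos; // back up one byte: if data[start-1] == '\n', the skipped
                   // "line" is empty, so the line starting at 'start' is kept
        }
        if (skipFirstLine) {
            pos += readLine(data, pos, null); // discard the (partial) first line
        }
        List<String> records = new ArrayList<>();
        StringBuilder sb = new StringBuilder();
        // Like Hadoop, keep reading while the line *starts* before 'end';
        // the last line may run past the split boundary.
        while (pos < end && pos < data.length) {
            sb.setLength(0);
            pos += readLine(data, pos, sb);
            records.add(sb.toString());
        }
        return records;
    }

    // Read one '\n'-terminated line starting at pos; append its text (without
    // the newline) to out if out is non-null; return bytes consumed.
    static int readLine(byte[] data, long pos, StringBuilder out) {
        int consumed = 0;
        for (long i = pos; i < data.length; i++) {
            consumed++;
            if (data[(int) i] == '\n') break;
            if (out != null) out.append((char) data[(int) i]);
        }
        return consumed;
    }

    public static void main(String[] args) {
        byte[] data = "aa\nbb\ncc\n".getBytes();
        // Boundary at byte 3, right after "aa\n": backing up to byte 2 and
        // skipping one line consumes only the '\n', so "bb" is not lost.
        System.out.println(readSplit(data, 0, 3)); // [aa]
        System.out.println(readSplit(data, 3, 9)); // [bb, cc]
        // Boundary mid-line at byte 4: the first split reads past its end to
        // finish "bb"; the second skips the partial line and starts at "cc".
        System.out.println(readSplit(data, 0, 4)); // [aa, bb]
        System.out.println(readSplit(data, 4, 9)); // [cc]
    }
}
```

In other words, I can see that the decrement makes a boundary that falls exactly after a newline behave the same as one that falls mid-line, but I'd like confirmation that this is the actual intent of those two lines.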