apache spark - Understanding MultiFileWordCount example in Hadoop -


I'm trying to understand the MultiFileWordCount example in order to implement CombineFileInputSplit in Hadoop.

Most of it is straightforward, except for two lines that I find confusing:

https://github.com/hanborq/hadoop/blob/master/src/examples/org/apache/hadoop/examples/MultiFileWordCount.java#L142

I'm not clear on why skipFirstLine is needed, and why the offset is decreased by 1.

The same pattern is used in a post by IBM as well. Any help here is appreciated.
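For context on what those two lines do: a record reader whose split starts mid-file cannot know whether its first byte begins a new line, so it skips everything up to the first newline and relies on the previous split's reader to finish that partial line. Backing the offset up by one byte first means that when the split happens to start exactly at a line boundary, only the preceding '\n' is discarded rather than a whole real line. Below is a minimal, self-contained sketch of that pattern using plain java.io over an in-memory byte array; the class name and the readSplit helper are illustrative, not Hadoop's API:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the skipFirstLine / offset-1 pattern
// found in split-aware line record readers.
public class SkipFirstLineDemo {

    // Return the lines "owned" by the byte range [start, end) of data,
    // assuming '\n' line endings.
    static List<String> readSplit(byte[] data, long start, long end) {
        boolean skipFirstLine = false;
        long offset = start;
        if (offset != 0) {
            // A split that does not begin at byte 0 may start mid-line.
            // Back up one byte: if the split begins exactly at a line
            // start, we then discard only the preceding newline instead
            // of losing the split's real first line.
            skipFirstLine = true;
            offset -= 1;
        }
        try {
            BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(data, (int) offset, data.length - (int) offset),
                StandardCharsets.UTF_8));
            long pos = offset;
            if (skipFirstLine) {
                // Discard bytes up to and including the next '\n'; the
                // previous split's reader emits that line in full.
                String partial = reader.readLine();
                pos += partial.getBytes(StandardCharsets.UTF_8).length + 1;
            }
            List<String> lines = new ArrayList<>();
            // Keep any line that *starts* before the split end, mirroring
            // how a line reader finishes the last line it begins.
            while (pos < end) {
                String line = reader.readLine();
                if (line == null) break;
                lines.add(line);
                pos += line.getBytes(StandardCharsets.UTF_8).length + 1;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.UTF_8);
        // A boundary at byte 8 falls inside "bravo": split 1 finishes
        // that line, split 2 skips the fragment and starts at "charlie".
        System.out.println(readSplit(data, 0, 8));           // [alpha, bravo]
        System.out.println(readSplit(data, 8, data.length)); // [charlie]
    }
}
```

Running the main method shows no line is lost or duplicated across the boundary; trying a split that starts exactly at byte 12 (the start of "charlie") shows why the offset-1 step matters, since only the empty fragment before the seeked-over '\n' gets discarded.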

