hadoop - Understanding the MultiFileWordCount example in Hadoop


I'm trying to understand the MultiFileWordCount example in order to implement a CombineFileInputSplit in Hadoop.

Most of it is easy to follow, except for two lines that I'm finding confusing:

https://github.com/hanborq/hadoop/blob/master/src/examples/org/apache/hadoop/examples/MultiFileWordCount.java#L142

I'm not clear on why skipFirstLine is needed, or why the offset is decreased by 1.

The same thing is used in a post by IBM as well. Any help here is appreciated.
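For context, the usual reason for this pattern in Hadoop's line-oriented record readers is that a split boundary can fall in the middle of a line. By convention, the reader that owns the previous split reads past its end to finish the straddling line, so every other split must discard its first (possibly partial) line. Decrementing the offset by 1 before seeking handles the edge case where the split starts exactly at a line boundary: skipping a "line" from offset - 1 consumes only the preceding newline, so the split's real first line is not lost. Below is a minimal, self-contained sketch of that logic using a byte array in place of an HDFS stream; the helper name `firstRecordOfSplit` is made up for illustration and is not part of the Hadoop API.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class SkipFirstLineDemo {
    // Simulates the record-reader initialization: any split that does not
    // start at byte 0 backs up one byte, seeks, and discards one line, so
    // the reader owning the PREVIOUS split emits the boundary-straddling line.
    static String firstRecordOfSplit(byte[] file, long offset) throws IOException {
        boolean skipFirstLine = false;
        if (offset != 0) {
            skipFirstLine = true;
            --offset;               // back up one byte before seeking
        }
        ByteArrayInputStream in = new ByteArrayInputStream(file);
        in.skip(offset);            // stand-in for FSDataInputStream.seek(offset)
        BufferedReader reader =
            new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
        if (skipFirstLine) {
            reader.readLine();      // discard the (possibly partial) first line
        }
        return reader.readLine();   // first record this split actually emits
    }

    public static void main(String[] args) throws IOException {
        byte[] file = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.UTF_8);
        // Split starting exactly at a line boundary (offset 6 = start of "bravo"):
        // seeking to 5 and skipping a line consumes only '\n', so "bravo" survives.
        System.out.println(firstRecordOfSplit(file, 6));  // bravo
        // Split starting mid-line (offset 8, inside "bravo"):
        // the partial line "ravo" is skipped and reading resumes at "charlie".
        System.out.println(firstRecordOfSplit(file, 8));  // charlie
    }
}
```

Without the `--offset`, a split starting exactly on a line boundary would wrongly skip its first complete line, and that record would be emitted by no split at all.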

