Understanding MultiFileWordCount example in Hadoop


I'm trying to understand the MultiFileWordCount example in order to implement CombineFileInputSplit in Hadoop.

Most of it is easy to follow, except for two lines that I'm finding confusing:

https://github.com/hanborq/hadoop/blob/master/src/examples/org/apache/hadoop/examples/MultiFileWordCount.java#L142

I'm not clear why skipFirstLine is needed, or why the offset is decreased by 1.
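For context (in case the link moves), the two lines sit inside the record reader's constructor and follow the same pattern as Hadoop's standard LineRecordReader. Below is a minimal, paraphrased sketch of that logic, not the exact source: the method and variable names (seekToFirstOwnedLine, fileIn, end) are placeholders. The comments describe what the skipFirstLine flag and the --offset are doing.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.util.LineReader;

    public class SplitStartSketch {

      // Returns the byte offset of the first line this split should own.
      // A split that does not start at byte 0 of the file skips its first
      // (possibly partial) line, because the previous split reads through
      // the end of whatever line it was in when it hit its boundary.
      public static long seekToFirstOwnedLine(FSDataInputStream fileIn,
                                              long offset, long end,
                                              Configuration conf) throws IOException {
        boolean skipFirstLine = false;
        if (offset != 0) {
          skipFirstLine = true;
          // Back up one byte: if the split happens to start exactly at the
          // beginning of a line, the readLine() below only consumes the
          // previous line's trailing newline, so the line starting at
          // 'offset' is kept for this split (the previous split stopped
          // before it). Without the -1 that line would be dropped by both.
          --offset;
          fileIn.seek(offset);
        }
        LineReader in = new LineReader(fileIn, conf);
        if (skipFirstLine) {
          // Read and discard up to the next newline, advancing 'offset' past it.
          offset += in.readLine(new Text(), 0,
              (int) Math.min((long) Integer.MAX_VALUE, end - offset));
        }
        return offset; // first byte of the first complete line owned by this split
      }
    }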

The same thing is used in a post on the IBM site as well. Any help here is appreciated.

