python - How to manage data batches for big neural network?
I am preparing to train a fairly big neural network (FC, conv, pool, etc. layers) on millions of small images (~100x100 px, 3 channels each) in Keras. The files take up around ~800 GB in total, so here is my question: how should I prepare the data?

I know Keras handles batching, but will the network learn better from 100 files of 8 GB each, or from ~300k files (each containing 32 or 64 merged images)? I think it is better to have bigger files and read them faster (8 big-file reads vs. 300k small reads), but I am not sure.

I have less than 100 GB of RAM, so I certainly cannot load the whole dataset at once.

Thanks!
You can use keras.preprocessing.image.ImageDataGenerator, provided by Keras, instead of loading the files into memory. It lets you set the batch size, and ImageDataGenerator can also augment the data in real time "for free" if you need that. Since it takes time to train the network on a batch of images anyway, reading the files from the hard drive doesn't slow down performance; the main bottleneck is computational power.

The interface and examples for keras.preprocessing.image.ImageDataGenerator can be found in the ImageDataGenerator documentation.
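To illustrate the idea, here is a minimal, library-free sketch of what such a disk-streaming batch loader does: only the current batch is ever referenced, so the full 800 GB dataset never needs to fit in RAM. This is a simplification of what `ImageDataGenerator.flow_from_directory` handles for you (it additionally decodes, resizes, and optionally augments the images); the file names below are dummy placeholders.

```python
import random


def batch_generator(file_paths, batch_size, shuffle=True):
    """Yield lists of `batch_size` file paths, looping over the dataset forever.

    Only one batch of paths (and, in real code, one batch of decoded images)
    is held in memory at a time.
    """
    paths = list(file_paths)
    while True:
        if shuffle:
            random.shuffle(paths)  # reshuffle once per epoch
        # step through the dataset in fixed-size chunks
        for start in range(0, len(paths) - batch_size + 1, batch_size):
            batch = paths[start:start + batch_size]
            # In real code you would load and decode the images here, e.g.:
            # images = [load_and_decode(p) for p in batch]
            yield batch


# Usage with dummy paths (no actual files needed for this sketch):
paths = [f"img_{i:06d}.png" for i in range(1000)]
gen = batch_generator(paths, batch_size=32)
first_batch = next(gen)
print(len(first_batch))  # 32
```

With a structure like this (or the real `ImageDataGenerator`), the on-disk layout matters much less: the reader streams whatever files exist, so you don't have to repack the images into a handful of giant files just to keep memory usage bounded.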