Irene Lee (Georgia Tech), Gyewon Lee (Seoul National University), Byung-Gon Chun.
Image classification is a popular application of deep neural networks. Data augmentation is often used when training models for such tasks, both to reduce estimation error caused by the limited amount of training data and to improve the generalization of trained models. However, because data augmentation is non-deterministic, preprocessed data cannot be reused, making the input pipeline overhead unavoidable in every iteration of training. In this work, we analyze the impact of input pipeline overhead on the training throughput of image classification models across various hardware settings. Our key observations are that data augmentation degrades training speed, and that the degradation worsens as 1) the model being trained gets smaller, 2) the CPU-to-GPU ratio decreases, and 3) the number of augmentation layers increases.
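The point about non-determinism can be illustrated with a toy sketch (not the paper's pipeline; the `augment` function and the 2x2 "image" here are purely hypothetical): because each call draws fresh randomness, the same input sample can yield different augmented outputs on different epochs, so the augmented result cannot simply be cached and reused.

```python
import random

def augment(image):
    """Toy augmentation: randomly flip a tiny 'image' horizontally.

    The flip decision is drawn fresh on every call, so two passes
    over the same sample may produce different outputs. This is why
    augmentation must be recomputed in the input pipeline each epoch.
    """
    if random.random() < 0.5:
        return [row[::-1] for row in image]  # horizontal flip
    return [list(row) for row in image]      # identity

image = [[1, 2], [3, 4]]

# Simulate 100 "epochs" over the same sample and collect the distinct
# augmented results; with overwhelming probability both variants appear.
outputs = {tuple(map(tuple, augment(image))) for _ in range(100)}
print(sorted(outputs))
```

Each distinct output forces the CPU-side preprocessing to run again, which is the per-iteration overhead the abstract refers to.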